Rust is a new systems programming language that promises to overcome the seemingly fundamental tradeoff 
between high-level safety guarantees and low-level control over resource management. Unfortunately, none 
of Rust’s safety claims have been formally proven, and there is good reason to question whether they actually 
hold. Specifically, Rust employs a strong, ownership-based type system, but then extends the expressive power 
of this core type system through libraries that internally use unsafe features. In this paper, we give the first 
formal (and machine-checked) safety proof for a language representing a realistic subset of Rust. Our proof is 
extensible in the sense that, for each new Rust library that uses unsafe features, we can say what verification 
condition it must satisfy in order for it to be deemed a safe extension to the language. We have carried out 
this verification for some of the most important libraries that are used throughout the Rust ecosystem. 
1 INTRODUCTION 


Systems programming languages like C and C++ give programmers low-level control over resource 
management at the expense of safety, whereas most other modern languages give programmers safe, 
high-level abstractions at the expense of control. It has long been a “holy grail” of programming 
languages research to overcome this seemingly fundamental tradeoff and design a language that 
offers programmers both high-level safety and low-level control. 

Rust [Matsakis and Klock II 2014; Rust team 2017], developed at Mozilla Research, comes closer 
to achieving this holy grail than any other industrially supported programming language to 
date. On the one hand, like C++, Rust supports zero-cost abstractions for many common systems 
programming idioms and avoids dependence on a garbage collector [Stroustrup 2012; Turon 
2015a]. On the other hand, like most modern high-level languages, Rust is type-safe and memory-safe. 
Furthermore, Rust’s type system goes beyond that of the vast majority of safe languages 
in that it statically rules out data races (which are a form of undefined behavior for concurrent 
programs in many languages like C++ or Rust), as well as common programming pitfalls like 
iterator invalidation [Gregor and Schupp 2003]. In other words, compared to mainstream “safe” 
languages, Rust offers both lower-level control and stronger safety guarantees. 

At least, that is the hope. Unfortunately, none of Rust’s safety claims have been formally proven, 
and there is good reason to question whether they actually hold. In this paper, we make a major 
step toward rectifying this situation by giving the first formal (and machine-checked) safety proof 
for a language representing a realistic subset of Rust. Before explaining our contributions in more 
detail, and in particular what we mean here by “realistic”, let us begin by exploring what makes 
Rust’s type system so unusual, and its safety so challenging to verify. 


1.1 Rust’s “Extensible” Approach to Safe Systems Programming 


At the heart of the Rust type system is an idea that has emerged in recent years as a unifying concept 
connecting both academic and mainstream language design: ownership. In its simplest form, the idea 
of ownership is that, although multiple aliases to a resource may exist simultaneously, performing 
certain actions on the resource (such as reading and writing a memory location) should require 
a “right” or “capability” that is uniquely “owned” by one alias at any point during the execution 
of the program. Although the right is uniquely owned, it can be “transferred” from one alias to 
another—e.g., upon calling a function or spawning a thread, or via synchronization mechanisms 
like locks. In more complex variations, ownership can be shared between aliases, but only in a 
controlled manner (e.g., shared ownership only permits read access [Boyland 2003]). In this way, 
ownership allows one to carefully administer the safe usage of potentially aliased resources. 

Ownership pervades both academic and mainstream language design for safe(r) systems programming. 
On the academic side, many proposals have been put forth for using types to enforce 
various ownership disciplines, including “ownership type” systems [Clarke et al. 1998]; region- or 
typestate-based systems for “safe C” programming in languages like Cyclone [Jim et al. 2002] and 
Vault [DeLine and Fahndrich 2001]; and substructural type systems like Ynot [Nanevski et al. 2008], 
Alms [Tov and Pucella 2011], and Mezzo [Balabonski et al. 2016]. Unfortunately, although these 
languages provide strong safety guarantees, none of them have made it out of academic research 
into mainstream use. On the mainstream side, “modern C++” (i.e., C++ since the 2011 standard [ISO 
Working Group 21 2011]) provides several features—e.g., smart pointers, move semantics, and RAII 
(Resource Acquisition Is Initialization)—that are essentially mechanisms for controlling ownership. 
However, while these features encourage safer programming idioms, the type system of C++ is too 
weak to enforce its ownership disciplines statically, so it is still easy to write programs with unsafe 
or undefined behavior using these features. 

In some sense, the key challenge in developing sound static enforcement of ownership disciplines— 
and the reason perhaps that academic efforts have not taken off in practice—is that no language 
can account for the safety of every advanced form of low-level programming that one finds in 
the wild, because there is no practical way to do so while retaining automatic type checking. 
As a result, previous designs employ type systems that are either too restrictive (i.e., preventing 
programmers from writing certain kinds of low-level code they want to write) or too expressive 
(i.e., encoding such a rich logic in the type structure that programmers must do proofs to appease 
the type checker). 

Rust addresses this challenge by taking a hybrid, extensible approach to ownership. 

The basic ownership discipline enforced by Rust’s type system is a simple one: If ownership 
of an object (of type T) is shared between multiple aliases (“shared references” of type &T), then 
none of them can be used to directly mutate it. This discipline, which is similar in spirit to (if 
different in detail from) that of several prior academic approaches, is enforceable automatically and 
eliminates a wide range of common low-level programming errors, such as “use after free”, data 
races, and iterator invalidation. However, it is also too restrictive to account for many low-level 
data structures and synchronization mechanisms, which fundamentally depend on the ability to 
mutate aliased state (e.g., to implement mutual exclusion or communication between threads). 

Consequently, to overcome this restriction, the implementations of Rust’s standard libraries 
make widespread use of unsafe operations, such as “raw pointer” manipulations for which aliasing 
is not tracked. The developers of these libraries claim that their uses of unsafe code have been 
properly “encapsulated”, meaning that if programmers make use of the APIs exported by these 
libraries but otherwise avoid the use of unsafe operations themselves, then their programs should 
never exhibit any unsafe/undefined behaviors. In effect, these libraries extend the expressive power 
of Rust’s type system by loosening its ownership discipline on aliased mutable state in a modular, 
controlled fashion: Even though a shared reference of type &T may not be used to directly mutate 
the contents of the reference, it may nonetheless be used to indirectly mutate them by passing it to 
one of the observably “safe” (but internally unsafe) methods exported by the object’s API. 
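
To make the notion of encapsulation concrete, the following is a minimal sketch of a library type that is internally unsafe but exports a safe API. It is an illustration only, not the standard library's implementation: the names IntCell and bump_via_aliases are hypothetical, and the sketch assumes that building on the standard UnsafeCell type (which is not Sync) is enough to keep the type confined to a single thread.

```rust
use std::cell::UnsafeCell;

/// Hypothetical, simplified analogue of `std::cell::Cell<i32>`.
pub struct IntCell {
    value: UnsafeCell<i32>,
}

impl IntCell {
    pub fn new(value: i32) -> Self {
        IntCell { value: UnsafeCell::new(value) }
    }

    /// Reads the contents through a shared reference.
    pub fn get(&self) -> i32 {
        // Unsafe: dereferences a raw pointer. Assumed sound here because no
        // reference into the cell ever escapes and `IntCell` is not `Sync`.
        unsafe { *self.value.get() }
    }

    /// Writes the contents through a shared reference (`&self`, not `&mut self`).
    pub fn set(&self, value: i32) {
        // Unsafe for the same reason as in `get`.
        unsafe { *self.value.get() = value; }
    }
}

/// Safe client code: mutates the cell via two aliases, with no `unsafe` block.
pub fn bump_via_aliases(cell: &IntCell) -> i32 {
    let alias_a = cell;
    let alias_b = cell;
    alias_a.set(alias_a.get() + 1);
    alias_b.set(alias_b.get() + 1);
    cell.get()
}
```

The client function never writes unsafe itself, yet it mutates aliased state; whether that is actually safe is precisely the kind of obligation that has to be discharged for the library rather than for its clients.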

However, there is cause for concern about whether Rust’s extensible approach is actually sound. 
Over the past few years, several soundness bugs have been found in Rust, both in the type system 
itself [Ben-Yehuda 2015a,b; Turon 2015b] and in libraries that use unsafe code [Ben-Yehuda 2015c; 
Biocca 2017; Jung 2017]. Some of these—such as the Leakpocalypse bug [Ben-Yehuda 2015c]—are 
quite subtle in that they involve an interaction of multiple libraries, each of which is (or seems to 
be) perfectly safe on its own. To make matters worse, the problem cannot easily be contained by 
blessing a fixed set of standard libraries as primitive and just verifying the soundness of those; 
for although it is considered a badge of honor for Rust programmers to avoid the use of unsafe 
code entirely, many nevertheless find it necessary to employ a sprinkling of unsafe code in their 
developments. Of course, it is not unusual for safe languages to provide unsafe escape hatches 
(e.g., Haskell’s unsafePerformIO, OCaml’s Obj.magic) to work around limitations of their type 
systems. But unsafe code plays such a fundamental role in Rust’s extensible ownership discipline 
that it cannot simply be swept aside if one wishes to give a realistic formal account of the language. 

The question remains: How can we verify that Rust’s extensible approach makes any sense? The 
standard technique for proving safety properties for high-level programming languages—namely, 
“progress and preservation” introduced by Wright and Felleisen [1994]—does not apply to languages 
in which one can mix safe and unsafe code. (Progress and preservation is a closed-world method, 
which assumes the use of a closed set of typing rules. This assumption is fundamentally violated by 
Rust’s extensible, open-world approach.) So, to account for safe-unsafe interaction, we need a way 
to specify formally what we are obliged to prove if we want to establish that a library employing 
unsafe code constitutes a sound extension of the Rust type system. Luckily, decades of research in 
semantics and verification have provided us with just the right tools for the job. 


1.2 RustBelt: An Extensible, Semantic Approach to Proving Soundness of Rust 


In this paper, we give the first formal (and machine-checked) account of Rust’s extensible approach 
to safe systems programming and how to prove it sound. 

For obvious reasons of scale, we do not consider the full Rust language, for which no formal 
description exists anyway. Instead, after beginning (in §2) with an example-driven tour of the 
most central and distinctive features of the Rust type system, we proceed (in §3) to describe 
λRust, a continuation-passing style language (of our own design) that formalizes the static and 
dynamic semantics of these central features. Crucially, λRust incorporates Rust’s notions of borrowing, 
lifetimes, and lifetime inclusion (which are fundamental to Rust’s ownership discipline) in a manner 
inspired by Rust’s Mid-level Intermediate Representation (MIR). For simplicity, λRust omits some 
orthogonal features of Rust such as traits (which are akin to Haskell type classes); it also avoids the 
morass of exciting complications concerning relaxed memory, instead adopting a simplified memory 
model featuring only non-atomic and sequentially consistent atomic operations. Nevertheless, λRust 
is realistic enough that studying it led us to uncover a previously unknown soundness bug in Rust 
itself [Jung 2017]. 

Our core contribution is then to develop an extensible soundness proof for λRust. The basic idea is 
to build a semantic model of the language—in particular, a logical relation [Plotkin 1973; Tait 1967]. 
The idea of proving soundness semantically is hardly new [Milner 1978], but it fell out of favor 
after Wright and Felleisen [1994] developed their simpler “syntactic” proof method. The semantic 
approach to type soundness is more powerful than the syntactic approach, however, because it 
offers an interpretation of what types mean (i.e., what terms inhabit them) that is more general 
than just “what the syntactic typing rules allow”—it describes when it is observably safe to treat a 
term as having a certain type, even if syntactically that term employs unsafe features. Moreover, 
thanks to the Foundational Proof-Carrying Code project [Ahmed et al. 2010; Appel 2001] and the 
development of “step-indexed” logical relations [Ahmed 2004; Appel and McAllester 2001] which 
arose from that project, we now know how to scale the semantic approach to languages with 
semantically complex features like recursive types and higher-order state. 

Here, we follow the style of recent “logical” accounts of step-indexed logical relations [Dreyer 
et al. 2011, 2010; Krogh-Jespersen et al. 2017; Turon et al. 2013], interpreting λRust types as predicates 
on values expressed in a rich program logic (see §4 and Challenge #1 below), and interpreting 
λRust typing judgments as logical entailments between these predicates (see §7). With our semantic 
model, which we call RustBelt, in hand, the proof of safety of λRust divides into three parts: 


(1) Verify that the typing rules of λRust are sound when interpreted semantically, i.e., as lemmas 
establishing that the semantic interpretations of the premises imply the semantic interpretation 
of the conclusion. This is called the fundamental theorem of logical relations. 

(2) Verify that, if a closed program is semantically well-typed according to the model, its execution 
will not exhibit any unsafe/undefined behaviors. This is called adequacy. 

(3) For any library that employs unsafe code internally, verify that its implementation satisfies 
the predicate associated with the semantic interpretation of its interface, thus establishing 
that the unsafe code has indeed been safely “encapsulated” by the library’s API. In essence, 
the semantic interpretation of the interface yields a library-specific verification condition. 


Together, these ensure that, so long as the only unsafe code in a well-typed λRust program is 
confined to libraries that satisfy their verification conditions, the program is safe to execute. 

This proof is “extensible” in the sense that whenever you have a new library that uses unsafe 
code and that you want to verify as being safe to use in Rust programs, RustBelt tells you the 
verification condition you need to prove about it. Using the Coq proof assistant [Coq team 2017], 
we have formally proven the fundamental theorem and adequacy once and for all, and we have 
also proven the verification conditions for (λRust ports of) several standard Rust libraries that 
use unsafe code, including Arc, Rc, Cell, RefCell, Mutex, RwLock, mem::swap, thread::spawn, 
rayon::join, and take_mut. 

Although the high-level structure of our soundness proof is standard [Ahmed 2004; Milner 1978], 
developing such a proof for a language as subtle and sophisticated as Rust has required us to tackle 
a variety of technical challenges, more than we can describe in the space of this paper. To focus the 
presentation, we will therefore not present all these challenges and their solutions in full technical 
detail (although further details can be found in our technical appendix and Coq development [Jung 
et al. 2017a]). Rather, we aim to highlight the following key challenges and how we dealt with them. 


Challenge #1: Choosing the right logic for modeling Rust. The most fundamental design 
choice in RustBelt was deciding which logic to use as its target, i.e., for defining semantic interpretations 
of Rust types. There are several desiderata for such a logic, but the most important is that it 
should support high-level reasoning about concepts that are central to Rust’s type system, such as 
ownership and borrowing. The logic we chose, Iris, is ideally suited to this purpose. 

Iris is a language-generic framework for higher-order concurrent separation logic [Jung et al. 
2016, 2017b, 2015; Krebbers et al. 2017a], which in the past year has been equipped with tactical 
support for conducting machine-checked proofs of programs in Coq [Krebbers et al. 2017b] and 
deployed in several ongoing verification projects [Kaiser et al. 2017; Swasey et al. 2017; Tassarotti 
et al. 2017; Timany et al. 2018]. By virtue of being a separation logic [O’Hearn 2007; Reynolds 
2002], Iris comes with built-in support for reasoning modularly about ownership. Moreover, the 
main selling point of Iris is its support for deriving custom program logics for different domains 
using only a small set of primitive mechanisms (namely, higher-order ghost state and impredicative 
invariants). In the case of RustBelt, we used Iris to derive a novel lifetime logic, whose primary 
feature is a notion of borrow propositions that mirrors the “borrowing” mechanism for tracking 
aliasing in Rust. This lifetime logic, which we describe in some detail in §5, has made it possible for 
us to give fairly direct interpretations of a number of Rust’s most semantically complex types, and 
to verify their soundness at a high level of abstraction. 


Challenge #2: Modeling Rust’s extensible ownership discipline. As explained above, a 
distinctive feature of Rust is its extensible ownership discipline: Owning a value of shared reference 
type &T confers different privileges depending on the type T. For many simple types, &T confers 
read-only access to the contents of the reference; but for types defined by libraries that use unsafe 
operations, &T may in fact confer mutable access to the contents, indirectly via the API of T. In Rust 
lingo, this phenomenon is termed interior mutability. 

To model interior mutability, RustBelt interprets types T in two ways: (1) with an ownership 
predicate that says what it means to own a value of type T, and (2) with a sharing predicate that 
says what it means to own a value of type &T. Unlike the ownership predicate, the sharing predicate 
must be a freely duplicable assertion, since Rust allows values of shared reference type to be freely 
copied. But otherwise there is a great deal of freedom in how it is defined, thus allowing us to assign 
very different semantics to &T for different types T. We exploit this freedom in proving semantic 
soundness of several Rust libraries whose types exhibit interior mutability (see §6). 


Challenge #3: Accounting for Rust’s “thread-safe” type bounds. Some of Rust’s types 
that exhibit interior mutability use non-atomic rather than atomic memory accesses to improve 
performance. As a result, however, they are not “thread-safe”, meaning that if one could transfer 
ownership of values of these types between threads, it could cause a data race. Rust handles this 
potential safety problem by restricting cross-thread ownership transfer to types that satisfy certain 
type bounds: the Send bound classifies types T that are thread-safe, and the Sync bound classifies 
types T such that &T is thread-safe. 

We account for these type bounds in RustBelt in a simple and novel way. First, we parameterize 
both the ownership and sharing predicates in the semantics of types by a thread identifier, representing 
the thread that is claiming ownership. We then define T to be Send if T’s ownership 
predicate does not depend on the thread id parameter (and Sync if T’s sharing predicate does not 
depend on the thread id parameter). Intuitively, this makes sense because, if ownership of a value v 
of type T is thread-independent, transferring ownership of v between threads is perfectly safe. 
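
At the source level, the Send and Sync bounds can be observed with a small sketch like the one below. It relies only on standard library types; the helper names assert_send, assert_sync, check_bounds, and count_in_threads are hypothetical and chosen for illustration. The commented-out lines are ones the compiler is expected to reject.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};
use std::thread;

// Compile-time checks: a call type-checks only if the bound holds.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

pub fn check_bounds() {
    assert_send::<Arc<Mutex<i32>>>();
    assert_sync::<Arc<Mutex<i32>>>();
    assert_send::<Cell<i32>>(); // a Cell may be moved to another thread...
    // assert_sync::<Cell<i32>>();         // ...but &Cell<i32> may not be shared: rejected
    // assert_send::<std::rc::Rc<i32>>();  // non-atomic reference count: rejected
}

/// Spawns `n` threads that each increment a shared, lock-protected counter.
pub fn count_in_threads(n: usize) -> i32 {
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Each thread receives its own handle; Arc<Mutex<i32>> is Send.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

Calling count_in_threads(4) returns 4, since every increment happens under the lock.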


All results in this paper have been fully formalized in the Coq proof assistant [Jung et al. 2017a]. 


2 A TOUR OF RUST 


In this section, we give a brief overview of some of the central features of the Rust type system. We 
do not assume the reader has any prior familiarity with Rust. 
2.1 Ownership Transfer 


A core feature of the Rust type system is to provide thread-safety, i.e., to guarantee the absence of 
unsynchronized race conditions. Race conditions can only arise from an unrestricted combination 
of aliasing and mutation on the same location. In fact, it turns out that ruling out mutation of 
aliased data also prevents other errors commonplace in low-level pointer-manipulating programs, 
like use-after-free or double-free. The essential idea of the Rust type system is thus to ensure that 
aliasing and mutation cannot occur at the same time on any given location, which it achieves by 
letting types represent ownership. 

Let us begin with the most basic form of ownership, exclusive ownership, in which, at any 
time, at most one thread is allowed to mutate a given location. Exclusive ownership rules out 
aliasing entirely, and thus prevents data races. However, just exclusive ownership would not be 
very expressive, and therefore Rust allows one to transfer ownership between threads. To see this 
principle in practice, consider the following sample program: 


1 let (snd, rcv) = channel();
2 join(move || {
3     let mut v = Vec::new(); v.push(0); // v: Vec<i32>
4     snd.send(v);
5     // Cannot access v: v.push(1) rejected
6 },
7 move || {
8     let v = rcv.recv().unwrap(); // v: Vec<i32>
9     println!("Received: {:?}", v);
10 });


Before we take a detailed look at the way the Rust type system handles ownership here, we 
briefly discuss syntax: let is used to introduce local, stack-allocated variables. These can be made 
mutable by using let mut. The first line uses a pattern to immediately destruct the pair returned 
by channel() into its components. The vertical bars || mark the beginning of an anonymous 
closure; if the closure took arguments, they would be declared between the bars. 

In this example, one thread sends a shallow copy (i.e., not duplicating data behind pointer 
indirections) of a vector v of type Vec<i32> (a resizable heap-allocated array of 32-bit signed 
integers) over a channel to another thread. In Rust, having a value of some type indicates that we 
are the exclusive owner of the data described by said type, and thus that nobody else has any kind 
of access to this array, i.e., no other part of the program can write to or even read from the array. 
When ownership is passed to a function (e.g., send), the function receives a shallow copy of the 
data.¹ At the same time, ownership of the data is considered to have moved, so the data is no longer 
available in the caller; in other words, Rust’s variable context is substructural. This is important because the 
receiver only receives a shallow copy, so if both threads were to use the vector, they could end up 
racing on the same data. 

The function channel creates a typed multi-producer single-consumer channel and returns the 
two endpoints as a pair. The function join is essentially parallel composition; it takes two closures 
and executes them in parallel, returning when both are done.² The keyword move instructs the 
type checker to move exclusive ownership of the sending end snd and receiving end rcv of the 
channel into the first, and, respectively, second closure. 

¹ Of course, Rust provides a way to do a deep copy that actually duplicates the vector, but it will never do this implicitly. 
² join is not in the Rust standard library, but part of Rayon [Stone and Matsakis 2017], a library for parallel list processing. 

In this example, the first thread creates a new empty Vec, v, and pushes an element onto it. Next, 
it sends v over the channel. The send function takes an argument of type Vec<i32>, so the Rust type 
checker considers v to be moved after the call to send. Any further attempts to access v would thus 
result in a compile-time error. The second thread works on the receiving end of the channel. It uses 
recv in order to receive (ownership of) the vector. However, recv is a fallible operation, so we call 
unwrap to trigger a panic (which aborts execution of the current thread) in case of failure. Finally, 
we print a debug representation of the vector (as indicated by the format string "{:?}"). 
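
For readers who want to run the ownership-transfer example, here is a hedged approximation that uses only the standard library. It assumes std::thread::scope as a stand-in for Rayon's join and std::sync::mpsc::channel for the channel; the function name send_vector_between_threads is hypothetical, and the received vector is returned instead of printed so that the result can be checked.

```rust
use std::sync::mpsc::channel;
use std::thread;

/// Moves a vector from one thread to another and returns what was received.
pub fn send_vector_between_threads() -> Vec<i32> {
    let (snd, rcv) = channel();
    thread::scope(|s| {
        // First thread: owns `snd`, builds the vector, and gives it away.
        s.spawn(move || {
            let mut v = Vec::new();
            v.push(0);
            snd.send(v).unwrap();
            // v.push(1); // rejected by the compiler: `v` was moved into `send`
        });
        // Second thread: owns `rcv` and becomes the new owner of the vector.
        let receiver = s.spawn(move || rcv.recv().unwrap());
        receiver.join().unwrap()
    })
}
```

Calling send_vector_between_threads() yields a vector containing the single element 0; uncommenting the second push makes the program fail to compile rather than race at run time.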

One aspect of low-level programming that is distinctively absent in the code above is memory 
management. Rust does not have garbage collection, so it may seem like our example program 
leaks memory, but that is not actually the case: Due to ownership tracking, Rust can tell when a 
variable (say, the vector v) goes out of scope without having been moved elsewhere. When that is 
the case, the compiler automatically inserts calls to a destructor, called drop in Rust. For example, 
when the second thread finishes in line 10, v is dropped. Similarly, the sending and receiving ends 
of the channel are dropped at the end of their closures. This way, Rust provides automatic memory 
management without garbage collection, and with predictable runtime behavior. 
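
The points at which destructors run can be observed with a small sketch. The type Noisy and the functions consume and drop_order are hypothetical names introduced for illustration; the sketch records each destructor call in a log so that the order is visible.

```rust
use std::cell::RefCell;

/// Hypothetical type that records when its destructor runs.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Takes ownership of its argument, which is dropped when the function returns.
fn consume(_value: Noisy<'_>) {}

pub fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let a = Noisy { name: "a", log: &log };
        let _b = Noisy { name: "b", log: &log };
        consume(a); // `a` is moved, so it is dropped inside `consume`
        log.borrow_mut().push("after consume".to_string());
    } // `_b` was never moved, so it is dropped here, at the end of its scope
    log.into_inner()
}
```

The log returned by drop_order() reads "drop a", "after consume", "drop b": the moved value is destroyed by its new owner, and the remaining value is destroyed when its scope ends.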


2.2 Mutable References 


Ownership transfer is a fairly straightforward mechanism for ensuring data-race freedom and 
related memory safety properties. However, it is also very restrictive. In fact, close inspection shows 
that even our first sample program does not strictly follow this discipline. Observe that in line 3, we 
are calling the method push on the vector v—and we keep using v afterwards. Indeed, it would be 
very inconvenient if pushing onto a vector required explicitly passing ownership to push and back. 
Rust’s solution to this issue is borrowing, which is the mechanism used to handle reference types. 
The idea is that v is not moved to push, but instead borrowed, i.e., passed by reference—granting 
push access to v for the duration of the function call. 

This is expressed in the type of push: fn(&mut Vec<i32>, i32) -> (). (Henceforth, we follow 
the usual Rust style and omit the return type if it is the unit type ().) The syntax v.push(0), as 
used in the example, is just syntactic sugar for Vec::push(&mut v, 0), where &mut v creates 
a mutable reference to v, which is then passed to push. A mutable reference grants temporary 
exclusive access to the vector, which in the example means that access is restricted to the duration 
of the call to push. Because the access is temporary, our program can keep using v when push 
returns. Moreover, the exclusive nature of this access guarantees that no other party will access the 
vector in any way during the function call, and that push cannot keep copies of the pointer to the 
vector. Mutable references are always unique pointers. 

The type of send, fn(&mut Sender<Vec<i32>>, Vec<i32>), shows another use of mutable 
references. The first argument is just borrowed, so the caller can use the channel again later. In 
contrast, the second argument is moved, using ownership transfer as already described above. 
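
The desugaring of method calls into explicit mutable borrows can be tried out directly. In this short sketch, push_twice and build are hypothetical helper names; Vec::push is the standard library method discussed above.

```rust
/// Borrows the vector mutably, but only for the duration of the call.
pub fn push_twice(v: &mut Vec<i32>, x: i32) {
    v.push(x);       // method-call sugar
    Vec::push(v, x); // explicit form; `v` already has type `&mut Vec<i32>`
}

pub fn build() -> Vec<i32> {
    let mut v = Vec::new();
    v.push(0);             // sugar for the next form
    Vec::push(&mut v, 1);  // explicit temporary mutable borrow
    push_twice(&mut v, 2); // the borrow ends when the call returns
    v                      // still owned here, so it can be returned
}
```

The value returned by build() is the vector [0, 1, 2, 2]; at no point are two mutable references to v usable at the same time.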


2.3 Shared References 


Rust’s approach to guaranteeing the absence of races and other memory safety violations is to rule out the 
combination of aliasing and mutation. So far, we have seen unique ownership (§2.1) and (borrowed) 
mutable references (§2.2), both of which allow for mutation but prohibit aliasing. In this section 
we discuss another form of references, namely shared references, which form the dual to mutable 
references: They allow aliasing but prohibit mutation. 

Like mutable references, shared references grant temporary access to a data structure, and operationally 
correspond to just pointers. The difference is in the guarantees and permissions provided 
to the receiver of the reference. While mutable references are exclusive (non-duplicable), shared 
references can be duplicated. In other words, shared references permit aliasing. As a consequence, 
to ensure data-race freedom and memory safety, shared references are read-only. 

Practically speaking, shared references behave like unrestricted variables in linear type systems, 
i.e., just like integers, they can be “copied” (as opposed to just being “moved”, which is possible 
with variables of all types). Rust expresses such properties of types using bounds, and the bound 
that describes unrestricted types is called Copy. Specifically, if a type is Copy, it means that doing a 
shallow copy (which, remember, is what Rust does to pass arguments) suffices to duplicate elements 
of the type. Both &T and i32 are Copy (for any T)—however, Vec<i32> is not! The reason for this is 
that Vec<i32> stores data on the heap, and a shallow copy does not duplicate this heap data. 
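
The difference between Copy types and move-only types can be checked with a short sketch. The names assert_copy and copy_vs_move are hypothetical; clone is the standard explicit deep copy for Vec.

```rust
// Compile-time check: a call type-checks only if `T` is `Copy`.
fn assert_copy<T: Copy>() {}

pub fn copy_vs_move() -> (usize, usize, Vec<i32>) {
    assert_copy::<i32>();
    assert_copy::<&Vec<i32>>();
    // assert_copy::<Vec<i32>>();      // rejected: Vec<i32> is not Copy
    // assert_copy::<&mut Vec<i32>>(); // rejected: mutable references are unique

    let v = vec![1, 2, 3];
    let r1 = &v;
    let r2 = r1; // copies the pointer; `r1` stays usable
    let lens = (r1.len(), r2.len());

    let deep = v.clone(); // explicit deep copy of the heap data
    let moved = v;        // shallow copy plus ownership transfer
    // v.len();           // rejected: `v` was moved
    drop(moved);
    (lens.0, lens.1, deep)
}
```

The function returns (3, 3, [1, 2, 3]): both references see the same vector, and the clone survives after the original has been moved and dropped.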

We can see shared references in action in the following example: 


1 let mut v = Vec::new(); v.push(1); 
2 join(|| println!("Thread 1: {:?}", &v), || println!("Thread 2: {:?}", &v)); 
3 v.push(2); 


This program starts by creating and initializing a vector v. It uses a shared reference &v to the 
vector in two threads, which concurrently print the contents of the vector. This time, the closures 
are not marked as move, which leads to v being captured by-reference, i.e., at type &Vec<i32>. As 
discussed above, this type is Copy, so the type checker accepts using &v in both threads. 

The concurrent accesses to v use non-atomic reads, which have no synchronization. This is 
safe because when a function holds a shared reference, it can rely on the data-structure not being 
mutated—so there cannot be any data races. (Notice that this is a much stronger guarantee than 
what C provides with const pointers: In C, const pointers prevent mutation by the current function; 
however, they do not rule out mutation by other functions.) 

Finally, when join returns, the example program re-gains full access to the vector v and can 
mutate v again in line 3. This is safe because join will only return when both threads have finished 
their work, so there cannot be a race between the push and the println. This demonstrates that 
shared references are powerful enough to temporarily share a data structure and permit unrestricted 
copying of the pointer, but regain exclusive access later. 
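
A runnable approximation of this pattern, again assuming std::thread::scope in place of Rayon's join, might look as follows. The function name share_then_mutate is hypothetical, and the threads compute values instead of printing so that the outcome can be checked.

```rust
use std::thread;

/// Shares `v` read-only between two threads, then mutates it afterwards.
pub fn share_then_mutate() -> (usize, i32, Vec<i32>) {
    let mut v = Vec::new();
    v.push(1);
    let (len, sum) = thread::scope(|s| {
        // Both closures capture `v` by shared reference (no `move`).
        let t1 = s.spawn(|| v.len());
        let t2 = s.spawn(|| v.iter().sum::<i32>());
        (t1.join().unwrap(), t2.join().unwrap())
    });
    // Both threads have finished, so exclusive access is available again.
    v.push(2);
    (len, sum, v)
}
```

The result is (1, 1, [1, 2]): both readers observe the one-element vector, and the second push happens only after the scope has ended.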


2.4 Lifetimes 


As previously explained, (mutable and shared) references borrow ownership and thus grant temporary 
access to a data structure. This immediately raises the question: “How long is temporary?” In 
Rust, this question is answered by equipping every reference with a lifetime. The full form of a 
reference type is actually &'a mut T or &'a T, where 'a is the lifetime of the reference. Rust uses 
a few conventions so that lifetimes can be elided in general, which is why they did not show up in 
the programs and types we considered so far. However, lifetimes play a crucial role in explaining 
what happens when the following function is type-checked: 





1 fn example(v: &/* 'a */mut Vec<i32>) {
2     v.push(21);                                           // Lifetime 'c: line 2
3     { let head: &/* 'b */mut i32 = v.index_mut(0);
4       // Cannot access v: v.push(2) rejected
5       *head = 23; }                                       // Lifetime 'b: lines 3-5
6     v.push(42);
7     println!("{:?}", v); // Prints [23, 42]
8 }                                                         // Lifetime 'a: lines 1-8


Here we define a function example that takes an argument of type &mut Vec<i32>. The function 
uses index_mut to obtain a pointer to the first element inside the vector. Writing to head in line 5 
changes the first element of the vector, as witnessed by the output in line 7. Such pointers directly 
into a data structure are sometimes called deep or interior pointers. One has to be careful when using 
deep pointers because they are a form of aliasing: When v is deallocated, head becomes a dangling 
pointer. In fact, depending on the data structure, any modification of the data structure could lead 
to deep pointers being invalidated. (One infamous instance of this issue is iterator invalidation, 
troubling not only low-level languages like C++, but also safe languages like Java.) This is why the 
call to push in line 4 is rejected. 

How does Rust manage to detect this problem and reject line 4 above? To understand this, we have 
to look at the type of index_mut: for<'b> fn(&'b mut Vec<i32>, usize) -> &'b mut i32.³ 
The for is a universal quantifier, making index_mut generic in the lifetime 'b. The caller can use 
any 'b to instantiate this generic function, limited only by an implicit requirement that 'b must 
last at least as long as the call to index_mut. Crucially, 'b is used for both the reference passed to 
the function and the reference returned. 

In our example, Rust has to infer the lifetime 'b left implicit when calling index_mut. Because 
the result of index_mut is stored in head, the type checker infers 'b to be the scope of head, i.e., 
lines 3-5. As a consequence, based on the type of index_mut, the vector must be borrowed for the 
same lifetime. So Rust knows that v is mutably borrowed for lines 3-5, which makes the access in 
line 4 invalid: The lifetime of the reference needed by push would overlap with the lifetime of the 
reference passed to index_mut, which violates the rule that mutable references must be unique. 
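
The role of 'b can be made concrete with a small sketch. The helper first_mut below is a hypothetical stand-in for index_mut at index 0 (it is not part of the paper or the standard library); its signature spells out the lifetime that ties the argument to the result. Note that current Rust compilers end a borrow at its last use rather than at the end of the enclosing scope, so the braces are kept only to mirror the example.

```rust
/// Hypothetical stand-in for `index_mut(0)`, with the lifetime 'b written out.
/// The same 'b appears on the argument and on the result, so the vector stays
/// mutably borrowed for as long as the returned reference is in use.
fn first_mut<'b>(v: &'b mut Vec<i32>) -> &'b mut i32 {
    &mut v[0]
}

/// Follows the shape of `example` from the text and returns the final vector.
fn run_example() -> Vec<i32> {
    let mut v = Vec::new();
    v.push(21);
    {
        let head: &mut i32 = first_mut(&mut v);
        // v.push(2); // rejected (E0499): `v` is still mutably borrowed via `head`
        *head = 23;
    }
    v.push(42); // accepted: the borrow for 'b has ended
    v
}
```

Calling run_example() yields the vector [23, 42], matching the output described for line 7.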

Lifetimes were not visible in the examples discussed so far, but they are always present implicitly. 
For example, the full type of push is given by for<'c> fn(&'c mut Vec<i32>, i32). The type 
checker thus has the freedom to pick any lifetime for the reference to the vector, constrained only 
by the implicit requirement that 'c has to cover at least the duration of the function call. This is 
why the vector can be used again immediately after push returns. 

Notice that, unlike in the previous examples, v in this example is just a mutable reference to begin 
with. Just like push, the type of example actually involves a generic lifetime 'a, and v has type 
&'a mut Vec<i32>. Despite not being the original owner of v, we can still borrow v to someone 
else, a phenomenon dubbed reborrowing. All we have to check is that the reborrow ends before 
the lifetime of our reference ends. In other words, the lifetime of the reborrow (the 'b used for 
index_mut, i.e., the scope of head) has to be included in the lifetime of the reference ('a). In this 
case, we know this to be true by making use of the implicit assumption that 'a includes this function 
call, so in particular, it includes 'b, which is entirely contained within the function call. 
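
As a minimal sketch of reborrowing (the function name bump_first is made up for illustration, and it assumes a non-empty vector), the reborrow can be written explicitly as &mut *v. The reborrow lives for some shorter lifetime included in 'a, after which v is usable again.

```rust
/// Reborrows `*v` for a lifetime shorter than 'a, then uses `v` again.
/// Panics if the vector is empty, since it indexes element 0.
fn bump_first<'a>(v: &'a mut Vec<i32>) -> usize {
    {
        // Explicit reborrow: `inner` has some lifetime 'b included in 'a.
        let inner: &mut Vec<i32> = &mut *v;
        inner[0] += 1;
    }
    // The reborrow has ended, so `v` can be used directly again.
    v.push(0);
    v.len()
}
```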


2.5 Interior Mutability 


So far, we have seen how Rust ensures memory safety and data-race freedom by ruling out the 
combination of aliasing and mutation. However, there are cases where shared mutable state is 
actually needed to (efficiently) implement an algorithm or a data structure. To support these use- 
cases, Rust provides some primitives providing shared mutable state. All of these have in common 
that they permit mutation through a shared reference—a concept called interior mutability. 

At this point, you may be wondering—how does this fit together with the story of mutation 
and aliasing being the root of all memory and thread safety problems? The key point is that 
these primitives have a carefully controlled API surface. Even though mutation through a shared 
reference is unsafe in general, it can still be safe when appropriate restrictions are enforced by 
either static or run-time checks. This is where we can see Rust’s “extensible” approach to safety in 
action. Interior mutability is not wired into the type system; instead, the types we are discussing 
here are implemented in the standard library using unsafe code (which we will verify in §6). 


2.5.1 Cell. The simplest type with interior mutability is Cell. Consider the following example: 


1 let c1 : &Cell<i32> = &Cell::new(0);
2 let c2 : &Cell<i32> = c1;
3 c1.set(2);
4 println!("{:?}", c2.get()); // Prints 2


³ usize is an unsigned integer type of platform-dependent size, large enough to cover the address space. 

The Cell<i32> type provides operations for storing and obtaining its content: set has type 
fn(&Cell<i32>, i32), and get has type fn(&Cell<i32>) -> i32. Both of these only take a 
shared reference, so they can be called in the presence of arbitrary aliasing. So after we just spent 
several pages explaining that safety in Rust arises from ruling out aliasing and mutation, now we 
have set, which seems to completely violate this principle. How can this be safe? 

The answer to this question has two parts. First of all, Cell only allows getting a copy of the 
content via get; it is not possible to obtain a pointer into the content. This rules out deep pointers 
into the Cell, making mutation safe. Unsurprisingly, get requires the content of the Cell to be 
Copy. In particular, get cannot be used with cells that contain non-Copy types like Vec<i32>. 
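
The following sketch illustrates both points with the standard library's Cell. The first function repeats the aliasing example; the second goes slightly beyond the text and shows how a cell with non-Copy content can still be used by moving the content out and back in (take, set and into_inner are existing methods of std::cell::Cell; the function names are made up for illustration).

```rust
use std::cell::Cell;

/// Two shared references to the same cell: a write through one is
/// visible through the other.
fn cell_aliasing() -> i32 {
    let cell = Cell::new(0);
    let c1: &Cell<i32> = &cell;
    let c2: &Cell<i32> = c1;
    c1.set(2);
    c2.get()
}

/// `get` is unavailable for non-Copy content, but the content can be moved
/// out (leaving a default value behind) and later moved back in.
fn cell_non_copy() -> Vec<i32> {
    let cell: Cell<Vec<i32>> = Cell::new(vec![1, 2]);
    // cell.get(); // rejected: Vec<i32> does not implement Copy
    let mut v = cell.take(); // the cell now holds an empty vector
    v.push(3);
    cell.set(v);
    cell.into_inner()
}
```

In neither function is there ever a pointer into the cell's content, which is what keeps mutation through a shared reference safe.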

However, there is still a potential source of problems, which arises from Rust’s support for 
multithreading. In particular, the following program must not be accepted: 


1 let c = &Cell::new(0);
2 join(|| c.set(1), || println!("{:?}", c.get()));


The threads perform conflicting unsynchronized accesses to c, i.e., this program has a data race. 

To rule out programs like the one above, Rust has a notion of types being “sendable to another 
thread”. Such types satisfy the Send bound. The type of join demands that the environment 
captured by the closure satisfies Send. For example, Vec<i32> is Send because when the vector is 
moved to another thread, the previous owner is no longer allowed to access the vector—so it is fine 
for the new owner, in a different thread, to perform any operation whatsoever on the vector. 

In the case above, the closure captures a shared reference to c of type &Cell<i32>. To check 
whether shared references are Send, there is another bound called Sync, with the property that type 
&T is Send if and only if T is Sync. Intuitively, a type is Sync if it is safe to have shared references 
to the same instance of the type in different threads. In other words, all the operations available on 
&T have to be thread-safe. For example, Vec<i32> is Sync because shared references only permit 
reading the vector, and it is fine if multiple threads do that at the same time. However, Cell<i32> 
is not Sync because set is not thread-safe. As a consequence, &Cell<i32> is not Send, which leads 
to the program above being rejected. 
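
These bounds can be checked at compile time with a small sketch. The helpers assert_send and assert_sync are made up for illustration; std::thread::scope is used as a stand-in for the join function of the example, and AtomicI32 is swapped in for Cell<i32> to obtain a variant the compiler accepts (neither appears in the text above).

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

// Illustrative helpers: they compile only if the bound holds.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn check_bounds() {
    assert_send::<Vec<i32>>();
    assert_sync::<Vec<i32>>();
    assert_send::<&Vec<i32>>(); // follows from Vec<i32>: Sync
    assert_send::<Cell<i32>>(); // a Cell may be moved to another thread
    // assert_sync::<Cell<i32>>();  // rejected: Cell<i32> is not Sync
    // assert_send::<&Cell<i32>>(); // rejected: hence &Cell<i32> is not Send
}

/// Accepted variant of the racy program: AtomicI32 is Sync, so both
/// threads may hold a shared reference to it.
fn atomic_counter() -> i32 {
    let c = AtomicI32::new(0);
    thread::scope(|s| {
        s.spawn(|| c.store(1, Ordering::SeqCst));
        s.spawn(|| {
            let _seen = c.load(Ordering::SeqCst);
        });
    });
    // Both threads have been joined when `scope` returns.
    c.load(Ordering::SeqCst)
}
```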


2.5.2 Mutex. The Cell type is a great example of interior mutability and a zero-cost abstraction 
as it comes with no overhead: get and set compile to plain unsynchronized accesses, so the 
compiled program is just as efficient as a C program using shared mutable state. However, as 
we have seen, Cell pays for this advantage by not being thread-safe. The Rust standard library 
also provides primitives for thread-safe shared mutable state, one being Mutex, which implements 
mutual exclusion (via a standard lock) for protecting access to one shared memory location. Consider 
the following example: 


1 let mutex = Mutex::new(Vec::new());
2 join( || { let mut guard = mutex.lock().unwrap();
3            guard.deref_mut().push(0) },
4       || { let mut guard = mutex.lock().unwrap();
5            println!("{:?}", guard.deref_mut()) } );


This program starts by creating a mutex of type Mutex<Vec<i32>> initialized with an empty vector. 
The mutex is then shared between two threads (implicitly relying on Mutex<Vec<i32>> being 
Sync). The first thread acquires the lock, and pushes an element to the vector. The second thread 
acquires the lock just to print the contents of the vector. 

The guard variables are of type MutexGuard<'a, Vec<i32>> where 'a is the lifetime of the 
shared mutex reference passed to lock (this ensures that the mutex itself will stay around for at 
least as long as the guard). Mutex guards serve two purposes. Most importantly, if a thread owns a 
guard, that means it holds the lock. To this end, guards provide a method deref_mut which turns 
a mutable reference to a MutexGuard into a mutable reference to the Vec<i32> with the same lifetime. 
Very much unlike Cell, the Mutex type permits obtaining deep pointers into the data guarded 
by the lock. In fact, the compiler will insert calls to deref_mut automatically where appropriate, 
making MutexGuard<'a, Vec<i32>> behave essentially like &'a mut Vec<i32>. 

Moreover, the guards are set up to release the lock when their destructors are called, which 
will happen automatically when the guards go out of scope. This is safe because, just like with 
index_mut (§2.4), the compiler ensures that deep pointers obtained through deref_mut have all 
expired by the time the guard is dropped. 
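
A runnable variant of the mutex example can be sketched with the standard library alone, assuming std::thread::scope as a substitute for join (the function name mutex_demo is made up). The calls to deref_mut are left implicit, and each guard releases the lock when it goes out of scope at the end of its closure.

```rust
use std::sync::Mutex;
use std::thread;

/// Two threads share a mutex-protected vector; returns its final content.
fn mutex_demo() -> Vec<i32> {
    let mutex = Mutex::new(Vec::<i32>::new());
    thread::scope(|s| {
        s.spawn(|| {
            let mut guard = mutex.lock().unwrap();
            guard.push(0); // the compiler inserts the call to deref_mut
        }); // guard dropped at the end of the closure: lock released
        s.spawn(|| {
            let guard = mutex.lock().unwrap();
            let _len = guard.len(); // reading goes through the guard as well
        });
    });
    // All threads have finished, so the mutex can be consumed.
    mutex.into_inner().unwrap()
}
```

Whichever thread acquires the lock first, the vector contains exactly one element after both threads have finished.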


3 THE λRust LANGUAGE AND TYPE SYSTEM 


In this section, we introduce λRust: our formal version of Rust. The Rust surface language comes 
with significant syntactic sugar (some of which we have already seen). To simplify the formalization, 
λRust features only a small set of primitive constructs, and requires the advanced sugar of 
Rust’s surface language to be desugared into primitive constructs. Indeed, something very similar 
happens in the compiler itself, where surface Rust is lowered into the Mid-level Intermediate 
Representation (MIR) [Matsakis 2016a]. λRust is much closer to MIR than to surface Rust. 

Before we present the syntax (§3.1), operational semantics (§3.2) and type system (§3.3) of λRust, 
we highlight some of its key features: 


• Programs are represented in continuation-passing style. This choice enables us to represent 
  complex control-flow constructs, like labeled break and early return, as present in the 
  Rust surface language. Furthermore, following the correspondence of CPS and control-flow 
  graphs [Appel 2007], this makes λRust easier to relate to MIR. 

• The individual instructions of our language perform a single operation. By keeping the 
  individual instructions simple and avoiding large composed expressions, it becomes possible 
  to describe the type system in a concise way. 

• The memory model of λRust supports pointer arithmetic and ensures that programs with 
  data races or illegal memory accesses can reach a stuck state in the operational semantics. In 
  particular, programs that cannot get stuck in any execution (a guarantee established by the 
  adequacy theorem of our type system, Theorem 7.2) are data-race free. 


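3.1 The Syntax 

The syntax of λRust is as follows: 

\[
\begin{array}{rcl}
\mathit{Path} \ni p &::=& x \mid p.n \\
\mathit{Val} \ni v &::=& \mathsf{false} \mid \mathsf{true} \mid z \mid \ell \mid \mathsf{funrec}\ f(\overline{x})\ \mathsf{ret}\ k := F \\
\mathit{Instr} \ni I &::=& v \mid p \mid p_1 + p_2 \mid p_1 - p_2 \mid p_1 \le p_2 \mid p_1 == p_2 \mid \mathsf{new}(n) \mid \mathsf{delete}(n, p) \\
 &\mid& {}^{*}p \mid p_1 := p_2 \mid p_1 :=_n {}^{*}p_2 \mid p \stackrel{\mathsf{inj}\,i}{:=} () \mid p_1 \stackrel{\mathsf{inj}\,i}{:=} p_2 \mid p_1 \stackrel{\mathsf{inj}\,i}{:=}_n {}^{*}p_2 \mid \ldots \\
\mathit{FuncBody} \ni F &::=& \mathsf{let}\ x = I\ \mathsf{in}\ F \mid \mathsf{letcont}\ k(\overline{x}) := F_1\ \mathsf{in}\ F_2 \mid \mathsf{newlft};\ F \mid \mathsf{endlft};\ F \\
 &\mid& \mathsf{if}\ p\ \mathsf{then}\ F_1\ \mathsf{else}\ F_2 \mid \mathsf{case}\ {}^{*}p\ \mathsf{of}\ \overline{F} \mid \mathsf{jump}\ k(\overline{x}) \mid \mathsf{call}\ f(\overline{x})\ \mathsf{ret}\ k
\end{array}
\]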


We let path offsets n and integer literals z range over the integers, and sum indices i range over 
the natural numbers. The language has two kinds of variables: program variables, which are written 
as x or f, and continuation variables, which are written as k. 

We distinguish four classes of expressions: function bodies F consist of instructions I that operate 
on paths p and values v. Only the most basic values can be written as literals: the Booleans false 
and true, integers z, locations ℓ (see §3.2 for further details), and functions funrec f(x) ret k := F. 
There are no literals for products or sums, as these only exist in memory, represented by sequences 
of values and tagged unions, respectively. Paths are used to express the values that instructions 
operate on. The common case is to directly refer to a local variable x. Beyond this, paths can refer 
to parts of a compound data structure laid out in memory: Offsets p.n perform pointer arithmetic, 
incrementing the pointer expressed by p by n memory cells. 

Function bodies mostly serve to chain instructions together and manage control flow, which is 
handled through continuations. Continuations are declared using letcont k(x) := F1 in F2, and 
called using jump k(x). The parameters x are instantiated when calling the continuation. We allow 
continuations to be recursive, in order to model looping constructs like while and for. 
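
To illustrate how a recursive continuation can stand in for a loop, here is a rough Rust sketch (not λRust itself; all names are made up). The loop "while i < n { acc += i; i += 1 }" becomes a continuation k that jumps to itself, and the code after the loop becomes the return continuation ret. Rust does not guarantee tail-call elimination, so this encoding is only meant for small inputs.

```rust
/// Sums the integers below `n`, with the loop expressed as a recursive
/// "continuation" `k` and the loop exit expressed as a jump to `ret`.
fn sum_below(n: u64) -> u64 {
    // Roughly: letcont k(i, acc) := if i < n then jump k(i + 1, acc + i)
    //                               else jump ret(acc)
    fn k(i: u64, acc: u64, n: u64, ret: &dyn Fn(u64) -> u64) -> u64 {
        if i < n {
            k(i + 1, acc + i, n, ret) // jump back to the loop head
        } else {
            ret(acc) // leave the loop
        }
    }
    // The return continuation simply hands back its argument.
    k(0, 0, n, &|result| result)
}
```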

The “ghost instructions” newlft and endlft start and end lifetimes. These instructions have 
interesting typing rules, but do not do anything operationally. 

Functions can be declared using funrec f(x) ret k := F, where f is a binder for the recursive 
call, x is a list of binders for the arguments, and k is a binder for the return continuation. The return 
continuation takes one argument for the return value. Functions can be called using call f(x) ret k, 
where x is the list of parameters and k is the continuation that should be called when the function 
returns. 

Local variables of λRust (as represented by let bindings) are pure values. This is different 
from local variables in Rust (and MIR), which are mutable and addressable. Hence, to correctly 
model Rust’s local variables, we allocate them on the heap. Similar to prior work on low-level 
languages [Krebbers 2015; Leroy et al. 2012], we do not make a distinction between the stack and 
the heap. In practice, this looks as follows: 


Rust:

fn option_as_mut<'a>
    (x: &'a mut Option<i32>) ->
    Option<&'a mut i32> {
  match *x {
    None => None,
    Some(ref mut t) => Some(t)
  }
}

λRust:

funrec option_as_mut(x) ret ret :=
  let r = new(2) in
  letcont k() := delete(1, x); jump ret(r) in
  let y = *x in case *y of
  - r :=^{inj 0} (); jump k()
  - r :=^{inj 1} y.1; jump k()


We see that the function argument x is a pointer, which is dereferenced when used and deallocated 
before the function returns. In this case, since the Rust program takes a pointer, x actually is a 
pointer to a pointer. Similarly, a pointer r is allocated for the return value. 

The λRust language has instructions for the usual arithmetic operations, memory allocation, and 
deallocation, as well as loading from memory ($^{*}p$) and storing a value into memory ($p_1 := p_2$). The 
memcpy-like instruction $p_1 :=_n {}^{*}p_2$ copies the contents of $n$ memory locations from $p_2$ to $p_1$. All of 
these accesses are non-atomic, i.e., they are not thread-safe. We will come back to this point in §3.2. 

The example above also demonstrates the handling of sums. Values of the Option<i32> type are 
represented by a sequence of two base values: an integer value that represents the tag (0 for None 
and 1 for Some) and, if the tag is 1, a value of type i32 for the argument t of Some(t). If the tag is 0, 
the second value can be anything. The instructions $p_1 \stackrel{\mathsf{inj}\,i}{:=} p_2$ and $p_1 \stackrel{\mathsf{inj}\,i}{:=}_n {}^{*}p_2$ can be used to assign 
to a pointer $p_1$ of sum type, setting both the tag $i$ and the value associated with this variant of the 
union, while $p_1 \stackrel{\mathsf{inj}\,i}{:=} ()$ is used for variants that have no data associated with them (like None). The 
case command is used to perform case-distinction on the tag, jumping to the n-th branch for tag n. 
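
The two-cell layout can be mimicked in a few lines of Rust. This is only a sketch under simplifying assumptions: Val, encode and decode are names invented here, a cell holds either an integer or a placeholder for "anything", and getting stuck is modelled by returning None from decode.

```rust
/// One memory cell of the toy model: an integer, or arbitrary content.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Val {
    Anything,
    Int(i64),
}

/// Lays out an Option<i32> as [tag, payload]: tag 0 for None, 1 for Some.
fn encode(o: Option<i32>) -> [Val; 2] {
    match o {
        None => [Val::Int(0), Val::Anything],
        Some(t) => [Val::Int(1), Val::Int(t as i64)],
    }
}

/// Case distinction on the tag; the outer None models a stuck execution.
fn decode(cells: [Val; 2]) -> Option<Option<i32>> {
    match cells[0] {
        Val::Int(0) => Some(None), // branch 0: the payload is ignored
        Val::Int(1) => match cells[1] {
            Val::Int(t) => Some(Some(t as i32)), // branch 1: read the payload
            Val::Anything => None,
        },
        _ => None, // no branch for this tag
    }
}
```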

There are more instructions available in the underlying core language, e.g., instructions to spawn 
threads or perform atomic accesses, including CAS (compare-and-swap). However, the type system 
does not provide any typing rules for these instructions, so they can only be used by unsafe code. 

3.2 The Operational Semantics 


The operational semantics of λRust is given by translation into a core language. The core language is 
a lambda calculus equipped with primitive values, pointer arithmetic, and concurrency. We define 
the semantics this way for three reasons. First of all, we can model some of the λRust constructs 
(e.g., $p_1 :=_n {}^{*}p_2$) as sequences of simpler instructions in the core language. Secondly, we can reduce 
both continuations and functions to plain lambda terms. Finally, the core language supports a 
substitution-based semantics, which makes reasoning more convenient, whereas the CPS grammar 
given above is not actually closed under substitution. The details of the core language are fully 
spelled out in our technical appendix [Jung et al. 2017a]. 

The memory model is inspired by CompCert [Leroy et al. 2012] in order to properly support 
pointer arithmetic. On top of this, we want the memory model to detect and rule out data races. 
Following C++11 [ISO Working Group 21 2011], we provide both non-atomic memory accesses, on 
which races are considered undefined behavior, and atomic accesses, which may be racy. However, 
for simplicity, we only provide sequentially consistent (SC) atomic operations, avoiding considera- 
tion of C++11’s relaxed atomics in this paper. Notice that, like in C++, atomicity is a property of 
the individual memory access, not of the memory location. The same location can be subject to 
both atomic and non-atomic accesses. We consider a program to have a data race if there are ever 
two concurrent accesses to the same location, at least one of which is a write, and at least one of 
which is non-atomic. To detect such data races, every location is equipped with some additional 
state (resembling a reader-writer lock), which is checked dynamically to see if a particular memory 
access is permitted. We have shown in Coq that if a program has a data race, then it has an execution 
where these checks fail. As a consequence, if we prove that a program cannot get stuck (which 
implies that the checks always succeed, in all executions), then the program is data-race free. 
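
The idea of a per-location state resembling a reader-writer lock can be sketched as follows. This is a toy model, not the paper's definition: it assumes that every non-atomic access is split into a begin and an end step, so that an interleaving of two threads can observe an access in progress, and all names (Access, Location, Stuck) are invented for illustration.

```rust
/// Per-location bookkeeping, resembling a reader-writer lock.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Access {
    /// Number of non-atomic reads currently in progress (0 means idle).
    Reading(u32),
    /// A non-atomic write is in progress.
    Writing,
}

/// Marker for an execution that cannot take its next step.
#[derive(Debug, PartialEq)]
struct Stuck;

#[derive(Debug)]
struct Location {
    value: i64,
    access: Access,
}

impl Location {
    fn new(value: i64) -> Self {
        Location { value, access: Access::Reading(0) }
    }
    fn begin_read(&mut self) -> Result<(), Stuck> {
        match self.access {
            Access::Reading(n) => { self.access = Access::Reading(n + 1); Ok(()) }
            Access::Writing => Err(Stuck), // read races with a write
        }
    }
    fn end_read(&mut self) -> Result<i64, Stuck> {
        match self.access {
            Access::Reading(n) if n > 0 => { self.access = Access::Reading(n - 1); Ok(self.value) }
            _ => Err(Stuck),
        }
    }
    fn begin_write(&mut self) -> Result<(), Stuck> {
        match self.access {
            Access::Reading(0) => { self.access = Access::Writing; Ok(()) }
            _ => Err(Stuck), // write races with another access
        }
    }
    fn end_write(&mut self, value: i64) -> Result<(), Stuck> {
        match self.access {
            Access::Writing => { self.value = value; self.access = Access::Reading(0); Ok(()) }
            _ => Err(Stuck),
        }
    }
}

/// Interleaving with a race: thread A starts a write, then thread B reads.
fn write_read_race() -> Result<(), Stuck> {
    let mut loc = Location::new(0);
    loc.begin_write()?; // thread A
    loc.begin_read()?;  // thread B: the check fails here
    Ok(())
}

/// Race-free interleaving: two overlapping reads, then a write, then a read.
fn reads_then_write() -> Result<i64, Stuck> {
    let mut loc = Location::new(7);
    loc.begin_read()?;
    loc.begin_read()?;
    let a = loc.end_read()?;
    let _b = loc.end_read()?;
    loc.begin_write()?;
    loc.end_write(a + 1)?;
    loc.begin_read()?;
    loc.end_read()
}
```

In this model, overlapping reads are permitted, while a write that overlaps with any other access makes the dynamic check fail, which is the stuck state referred to above.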

In our handling of uninitialized memory, we follow Lee et al. [2017]. Upon allocation, memory 
holds a poison value ☠ that will cause the program to get stuck if it is ever used for a computation 
or a conditional branch. The only safe operations on ☠ are loading from and storing to memory. 


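3.3 The Type System 


The types and contexts of λRust are as follows: 

\[
\begin{array}{rcl}
\mathit{Lft} \ni \kappa &::=& \alpha \mid \mathbf{static} \\
\mathit{Mod} \ni \mu &::=& \mathbf{mut} \mid \mathbf{shr} \\
\mathit{Type} \ni \tau &::=& T \mid \mathbf{bool} \mid \mathbf{int} \mid \mathbf{own}_n\,\tau \mid \&^{\kappa}_{\mu}\,\tau \mid \lightning_n \mid \Pi\overline{\tau} \mid \Sigma\overline{\tau} \mid \forall\overline{\alpha}.\ \mathsf{fn}(\digamma : \mathbf{E}; \overline{\tau}) \to \tau \mid \mu\,T.\,\tau \\
\mathbf{E} &::=& \emptyset \mid \mathbf{E}, \kappa \sqsubseteq_{\mathsf{e}} \kappa' \\
\mathbf{L} &::=& \emptyset \mid \mathbf{L}, \kappa \sqsubseteq_{\mathsf{l}} \overline{\kappa} \\
\mathbf{T} &::=& \emptyset \mid \mathbf{T}, p \lhd \tau \mid \mathbf{T}, p \lhd^{\dagger\kappa} \tau \\
\mathbf{K} &::=& \emptyset \mid \mathbf{K}, k \lhd \mathsf{cont}(\mathbf{L}; \overline{x}.\,\mathbf{T})
\end{array}
\]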


Selected typing rules are shown in Figure 1 and Figure 2. We first discuss the types provided by 
the system, before looking at some examples. 

There are two kinds of pointer types: owned pointers $\mathbf{own}_n\,\tau$ and (borrowed) references $\&^{\kappa}_{\mu}\,\tau$. 

Owned pointers $\mathbf{own}_n\,\tau$ are used to represent full ownership of (some part of) a heap allocation. 
Because we model the stack using heap allocations, owned pointers also represent Rust’s local, 
stack-allocated variables. As usual, $\tau$ is the type of the pointee. Furthermore, $n$ tracks the size of the 
entire allocation. This can be different from the size of $\tau$ for inner pointers that point into a larger 
data structure.⁴ Still, most of the time, $n$ is the size of $\tau$, in which case we omit the subscript. 

References $\&^{\kappa}_{\mu}\,\tau$ are qualified by a modifier $\mu$, which is either mut (for mutable references, which 
are unique) or shr (for shared references), and a lifetime $\kappa$. References $\&^{\kappa}_{\mu}\,\tau$ are borrowed for 
lifetime $\kappa$ and, as such, can only be used as long as the lifetime $\kappa$ is alive, i.e., still ongoing. Lifetimes 
begin and end at the newlft and endlft ghost instructions, following F-NEWLFT and F-ENDLFT. 


⁴ Such pointers can be obtained using C-SPLIT-OWN. 

Furthermore, the special lifetime static lasts for the execution of the entire program (corresponding 
to 'static in Rust, which plays the same role). The type system is able to abstract over lifetimes, 
so most of the time, we will work with lifetime variables $\alpha$. 

The type $\lightning_n$ describes arbitrary sequences of $n$ base values. This type represents uninitialized 
memory. For example, when allocating an owned pointer (rule S-NEW), its type is $\mathbf{own}_n\,\lightning_n$. Owned 
pointers permit strong updates, which means their type $\tau$ can change when the memory gets 
(re-)initialized. Similarly, the type changes back to $\mathbf{own}_n\,\lightning_n$ when data is moved out of the owned 
pointer (rule TREAD-OWN-MOVE). Note that this is sound because ownership of owned pointers is 
unique. 

The types $\Pi\overline{\tau}$ and $\Sigma\overline{\tau}$ represent n-ary products and sums, respectively. In particular, this gives 
rise to a unit type () (the empty product $\Pi[]$) and the empty type ! (the empty sum $\Sigma[]$). We use 
$\tau_1 \times \tau_2$ and $\tau_1 + \tau_2$ as notation for binary products ($\Pi[\tau_1, \tau_2]$) and sums ($\Sigma[\tau_1, \tau_2]$), respectively. 

Function types $\forall\overline{\alpha}.\ \mathsf{fn}(\digamma : \mathbf{E}; \overline{\tau}) \to \tau$ can be polymorphic over lifetimes $\overline{\alpha}$. The external lifetime 
context $\mathbf{E}$ can be used to demand that one lifetime parameter be included in another one. The 
lifetime $\digamma$ here is a binder that can be used in $\mathbf{E}$ to refer to the lifetime of this function. For example, 
$\forall\alpha.\ \mathsf{fn}(\digamma : \digamma \sqsubseteq_{\mathsf{e}} \alpha;\ \&^{\alpha}_{\mathbf{mut}}\,\mathbf{int}) \to ()$ is the type of a function that takes a mutable reference to an 
integer with any lifetime that covers this function call (matching the implicit assumption Rust 
makes), and returns unit. Note that, to allow passing and returning objects of arbitrary size, both 
the parameters and the return value are transmitted via owned pointers; this calling convention is 
universally applied and hence does not show up in the function type. 

Finally, λRust supports recursive types $\mu\,T.\,\tau$, with the restriction (enforced by the well-formedness 
judgment shown in the appendix [Jung et al. 2017a]) that $T$ only appears in $\tau$ below a pointer type 
or within a function type. 

To keep the type system of λRust focused on our core objective (modeling borrowing and lifetimes), 
there is no support for type-polymorphic functions. Instead, we handle polymorphism on the meta- 
level: In our shallow embedding of the type system in Coq, we can quantify any definition and 
theorem over arbitrary semantic types (§4). We exploit this flexibility when verifying the safety 
of Rust libraries that use unsafe features (§6). These libraries are typically polymorphic, and by 
keeping the verification similarly polymorphic, we can prove that functions and libraries are safe 
to use at any instantiation of their type parameters. 

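Type-checking the example. The typing judgments for function bodies $F$ and instructions $I$ have 
the shape $\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T} \vdash F$ and $\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{T}_1 \vdash I \dashv x.\,\mathbf{T}_2$. To see these judgments in action, we will 
go through part of the typing derivation of the example from §3.1. The code, together with some 
annotated type and continuation contexts, is repeated in Figure 3. Overall, we will want to derive a 
judgment for the body of option_as_mut, in the following initial contexts: 

\[
\begin{array}{rcl}
\Gamma_1 &:=& x : \mathsf{val},\ \mathit{ret} : \mathsf{val},\ \alpha : \mathsf{lft},\ \digamma : \mathsf{lft} \\
\mathbf{E}_1 &:=& \digamma \sqsubseteq_{\mathsf{e}} \alpha \\
\mathbf{L}_1 &:=& \digamma \sqsubseteq_{\mathsf{l}} [] \\
\mathbf{K}_1 &:=& \mathit{ret} \lhd \mathsf{cont}(\digamma \sqsubseteq_{\mathsf{l}} [];\ r.\ r \lhd \mathbf{own}\,(() + \&^{\alpha}_{\mathbf{mut}}\,\mathbf{int})) \\
\mathbf{T}_1 &:=& x \lhd \mathbf{own}\ \&^{\alpha}_{\mathbf{mut}}\,(() + \mathbf{int})
\end{array}
\]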

The first context, the variable context $\Gamma$, is the only binding context. It introduces all variables 
that are free in the judgment and keeps track of whether they are program variables ($x : \mathsf{val}$; 
this also covers continuations), lifetime variables ($\alpha : \mathsf{lft}$), or type variables⁵ ($T : \mathsf{type}$). All the 
remaining contexts state facts and assert ownership related to variables introduced here, but they 
do not introduce additional binders. 


⁵ Type variables can only occur in the definition of recursive types. 
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Rules for lifetimes: 





(T | E;L+ «yj & kz and | E;L +t x alive) 














L LINCL-LOCAL LINCL-EXTERN L 
INCL-STATIC KCVKEL pale xLer’€E INCL-REFL 
IT |E;L+ xk E static 7 —_____ T|E;L+xEx 
T|E;Leeex 








T\|E;LEKER’ 
LINCL-TRANS 


TIE:LeeCxw T|E;Lex’ 
TIE:LeK EK” 


LALIVE-LOCAL 
KE, KEL Vi. E;L + kj alive 
IT | E;Lt x alive 





” 
K 














LALIVE-INCL 
T | E;L+ x alive T|E;Ltx 





In 
a 





E;Lt x’ alive 


Rules for subtyping and type coercions: (T | E;Lt ry = rg andT | E;L+ Ty s&s T2) 
T-BOR-LFT C-SUBTYPE 


C-coPy 
T|E;LEKEK’ T|E;L+ersc’ 














T copy 
t tT 
TELE aE r= &er T/ELepstSpar’ TIELtpsrSparpsr 
C-SHARE 
C-SPLIT-OWN abi T | E;Lt x alive 
E;L+ p 4d own, 7] X T2 © p.0 4 own, 7%), J OWNy T2 


ee 
TI EL+ ps Sut? > P&T 
C-REBORROW 
C-BORROW T|E;L+e x’ 





K 


T|E;L tp down, t Sp s&X rp s* own, T 





TIELi psa 7 Spake tpa® wr 
Rules for reading and writing: (C | E;L+ tr e-? rg and | E;L+ 1% —7 72) 
TREAD-BOR 
T copy T | E;Lt x alive 

P| E;Lt & ro? i 


TREAD-OWN-COPY 
T copy 


T | E;L+ own, t o—* owny Tt 


TREAD-OWN-MOVE 
n = size(T) 





IT | E;L+ own, t —* own fn 


TWRITE-OWN 


TWRITE-BOR 
size(r) = size(r’) 


T | E;Lt x alive 
P| E;L+ Sh yt —” &h 


mut © 








ha 
T | E;L+ own, t’ —0* owny T 


Rules for typing of instructions: 
S-NUM 
T|E;L|@+z4x.x dint 


(C | E;L|T+tI4x.T2) 
S-NAT-LEQ 


T | E;L | pi < int, p2 dintt+ pi < p2 4x.x < bool 
S-DELETE 

S-NEW n = size(r) 

T | E;L|@+ new(n) 4 x.x downy fn 





T | E;L| p< own, 7 t delete(n, p) 10 








S-DEREF S-SUM-ASSGN 7 
PIE Ley eo? size(t) = 1 Tj=T ry oF c/ 
TIEL|paut*pax.pdat,x<dr 


inji 


E;L| pi 4 t,.p2 tt pi :=poipisy 


Fig. 1. A selection of the typing rules of Apust (helper judgments and instructions). 
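
Rules for typing of function bodies: ($\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T} \vdash F$)

\[ \text{F-CONSEQUENCE}\quad \dfrac{\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \mathbf{T} \Rightarrow \mathbf{T}' \qquad \mathbf{K}' \subseteq \mathbf{K} \qquad \Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}'; \mathbf{T}' \vdash F}{\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T} \vdash F} \]
\[ \text{F-LET}\quad \dfrac{\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{T}_1 \vdash I \dashv x.\,\mathbf{T}_2 \qquad \Gamma, x : \mathsf{val} \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T}_2, \mathbf{T} \vdash F}{\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T}_1, \mathbf{T} \vdash \mathsf{let}\ x = I\ \mathsf{in}\ F} \]
\[ \text{F-LETCONT}\quad \dfrac{\Gamma, k, \overline{x} : \mathsf{val} \mid \mathbf{E}; \mathbf{L}_1 \mid \mathbf{K}, k \lhd \mathsf{cont}(\mathbf{L}_1; \overline{x}.\,\mathbf{T}'); \mathbf{T}' \vdash F_1 \qquad \Gamma, k : \mathsf{val} \mid \mathbf{E}; \mathbf{L}_2 \mid \mathbf{K}, k \lhd \mathsf{cont}(\mathbf{L}_1; \overline{x}.\,\mathbf{T}'); \mathbf{T} \vdash F_2}{\Gamma \mid \mathbf{E}; \mathbf{L}_2 \mid \mathbf{K}; \mathbf{T} \vdash \mathsf{letcont}\ k(\overline{x}) := F_1\ \mathsf{in}\ F_2} \]
\[ \text{F-JUMP}\quad \dfrac{\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \mathbf{T} \Rightarrow \mathbf{T}'[\overline{y}/\overline{x}]}{\Gamma \mid \mathbf{E}; \mathbf{L} \mid k \lhd \mathsf{cont}(\mathbf{L}; \overline{x}.\,\mathbf{T}'); \mathbf{T} \vdash \mathsf{jump}\ k(\overline{y})} \]
\[ \text{F-NEWLFT}\quad \dfrac{\Gamma, \alpha : \mathsf{lft} \mid \mathbf{E}; \mathbf{L}, \alpha \sqsubseteq_{\mathsf{l}} \overline{\kappa} \mid \mathbf{K}; \mathbf{T} \vdash F}{\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T} \vdash \mathsf{newlft};\ F} \]
\[ \text{F-ENDLFT}\quad \dfrac{\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T}' \vdash F \qquad \mathbf{T} \Rightarrow^{\dagger\kappa} \mathbf{T}'}{\Gamma \mid \mathbf{E}; \mathbf{L}, \kappa \sqsubseteq_{\mathsf{l}} \overline{\kappa} \mid \mathbf{K}; \mathbf{T} \vdash \mathsf{endlft};\ F} \]
\[ \text{F-CASE-BOR}\quad \dfrac{\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \kappa\ \mathsf{alive} \qquad \forall i.\ (\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T}, p.1 \lhd \&^{\kappa}_{\mu}\,\tau_i \vdash F_i)}{\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T}, p \lhd \&^{\kappa}_{\mu}\,\Sigma\overline{\tau} \vdash \mathsf{case}\ {}^{*}p\ \mathsf{of}\ \overline{F}} \]
\[ \text{F-CALL}\quad \dfrac{\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \mathbf{T} \Rightarrow \overline{x} \lhd \mathbf{own}\,\overline{\tau}, \mathbf{T}' \qquad \Gamma \mid \mathbf{E}; \mathbf{L} \vdash \kappa\ \mathsf{alive} \qquad \Gamma, \digamma : \mathsf{lft} \mid \mathbf{E}, \digamma \sqsubseteq_{\mathsf{e}} \kappa; \mathbf{L} \vdash \mathbf{E}'}{\Gamma \mid \mathbf{E}; \mathbf{L} \mid k \lhd \mathsf{cont}(\mathbf{L}; y.\ y \lhd \mathbf{own}\,\tau, \mathbf{T}'); \mathbf{T}, f \lhd \mathsf{fn}(\digamma : \mathbf{E}'; \overline{\tau}) \to \tau \vdash \mathsf{call}\ f(\overline{x})\ \mathsf{ret}\ k} \]

Fig. 2. A selection of the typing rules of λRust (function bodies). 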


Our initial variable context consists of the parameter x, our return continuation ret, the lifetime 
$\alpha$ (corresponding to 'a), and the lifetime $\digamma$, which (by convention) is always the name of the lifetime 
of the current function. This lifetime is used in the external lifetime context $\mathbf{E}$ to state that $\alpha$ outlives 
the current function call. Rust does not have a direct equivalent of $\digamma$ in its surface syntax; instead it 
always implicitly assumes that lifetime parameters like 'a outlive the current function. 

The typing context $\mathbf{T}$ is in charge of describing ownership of local variables. It mostly contains 
type assignments $p \lhd \tau$. It is important to stress that the typing context is substructural: Type 
assignments can only be duplicated if the type satisfies $\tau\ \mathsf{copy}$ (C-COPY), corresponding to Rust’s 
Copy bound. In this case, we have a single variable x (our argument), which is an owned pointer to 
$\&^{\alpha}_{\mathbf{mut}}\,(() + \mathbf{int})$, the λRust equivalent of the Rust type &'a mut Option<i32>. As already mentioned, 
the additional owned pointer indirection here models the fact that x on the Rust side has an address 
in memory. 

We mostly use F-LET to type-check the function one instruction at a time. The first instruction is 
new, so we use S-NEW. That extends our typing context with r being an uninitialized owned pointer: 

\[ x \lhd \mathbf{own}\ \&^{\alpha}_{\mathbf{mut}}\,(() + \mathbf{int}),\quad r \lhd \mathbf{own}\ \lightning_2 \]

Next, we declare a continuation (letcont k() := ...). Continuations are tracked in the continuation 
context $\mathbf{K}$. Initially, we already have our return continuation ret of type 
$\mathsf{cont}(\digamma \sqsubseteq_{\mathsf{l}} [];\ r.\ r \lhd \mathbf{own}\,(() + \&^{\alpha}_{\mathbf{mut}}\,\mathbf{int}))$ in that context. This says that ret expects one argument r of our return type, 
Option<&'a mut i32>. 

The continuation also makes assumptions about the local lifetime context L at the call site, which 
we will discuss soon. As usual with CPS, since the return type is given by the return continuation, 
the function judgment does not have a notion of a return type itself. 

The function option_as_mut declares a continuation k to represent the merging control flow 
after the case. Following F-LETCONT, we have to pick $\mathbf{T}'$, the typing context at the call site of the 
continuation. It turns out that the right choice is $r \lhd \mathbf{own}\,(() + \&^{\alpha}_{\mathbf{mut}}\,\mathbf{int}),\ x \lhd \mathbf{own}\ \lightning_1$. Let us omit 
checking that the continuation actually has this type, and continue on with the following new item 
in our continuation context: 

\[ k \lhd \mathsf{cont}(\digamma \sqsubseteq_{\mathsf{l}} [];\ r \lhd \mathbf{own}\,(() + \&^{\alpha}_{\mathbf{mut}}\,\mathbf{int}),\ x \lhd \mathbf{own}\ \lightning_1) \]

funrec option_as_mut(x) ret ret :=
  {K : ret ◁ cont(ϝ ⊑l []; r. r ◁ own (() + &^α_mut int)); T : x ◁ own &^α_mut (() + int)}
  let r = new(2) in
  {T : x ◁ own &^α_mut (() + int), r ◁ own ↯_2}
  letcont k() := delete(1, x); jump ret(r) in
  {K : ret ◁ ..., k ◁ cont(ϝ ⊑l []; r ◁ own (() + &^α_mut int), x ◁ own ↯_1)}
  let y = *x in
  {T : x ◁ own ↯_1, r ◁ own ↯_2, y ◁ &^α_mut (() + int)}
  case *y of
  - r :=^{inj 0} (); jump k()
  - {T : x ◁ own ↯_1, r ◁ own ↯_2, y.1 ◁ &^α_mut int}
    r :=^{inj 1} y.1;
    {T : x ◁ own ↯_1, r ◁ own (() + &^α_mut int)}
    jump k()

Fig. 3. Example code with annotated type and continuation contexts. 


Next, the code dereferences the argument (let y = *x), which unwraps the additional owned 
pointer indirection that got inserted in the translation. Dereferencing is type-checked using S-DEREF. 
This rule uses a helper judgment: $\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \tau_1 \mathrel{\circ\!-}^{\tau} \tau_2$ means that we can read a value of type $\tau$ (the 
pointee) from a pointer of type $\tau_1$, and doing so will change the type of the pointer to $\tau_2$. In this 
case, we derive $\mathbf{own}\ \&^{\alpha}_{\mathbf{mut}}\,(() + \mathbf{int}) \mathrel{\circ\!-}^{\&^{\alpha}_{\mathbf{mut}}\,(() + \mathbf{int})} \mathbf{own}\ \lightning_1$ from TREAD-OWN-MOVE. The type of the 
pointer changes because we moved the content out of the owned pointer. Effectively, x is now no 
longer initialized. After this instruction, our typing context becomes: 

\[ x \lhd \mathbf{own}\ \lightning_1,\quad r \lhd \mathbf{own}\ \lightning_2,\quad y \lhd \&^{\alpha}_{\mathbf{mut}}\,(() + \mathbf{int}) \]


Next, we have to type-check the case using F-CASE-BOR, which involves loading the tag from y. 
Because we are dereferencing a reference (as opposed to an owned pointer) here, the type system 
requires us to show that the lifetime ($\alpha$) is still alive. This is where the lifetime contexts $\mathbf{E}$ and $\mathbf{L}$ 
come in: We have to show $\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \alpha\ \mathsf{alive}$. 

To this end, we first make use of the external lifetime context $\mathbf{E}$, which tracks inclusions between 
lifetime parameters and the lifetime $\digamma$ of the current function. Concretely, we make use of $\digamma \sqsubseteq_{\mathsf{e}} \alpha$ 
and apply LALIVE-INCL, which reduces the goal to $\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \digamma\ \mathsf{alive}$: Because $\digamma$ is shorter than $\alpha$, it 
suffices to show that $\digamma$ is still alive. In the second step, we employ our local lifetime context $\mathbf{L}$, which 
tracks lifetimes that we control. Elements of this context are of the form $\kappa \sqsubseteq_{\mathsf{l}} \overline{\kappa}$, indicating that $\kappa$ is 
a local lifetime with its superlifetimes listed in $\overline{\kappa}$. The rule LALIVE-LOCAL expresses that $\kappa$ is alive as 
long as all its superlifetimes are alive. Because $\digamma$ has no superlifetimes ($\digamma \sqsubseteq_{\mathsf{l}} []$), this finishes the 
proof that $\digamma$ is alive, and so is $\alpha$. 
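
The two-step argument can be replayed with a toy checker. The sketch below is a simplification written for illustration, not the paper's definition: lifetimes are plain strings ("f" for the function lifetime, "a" for the parameter lifetime), the type LftCtx and its methods are invented names, and LALIVE-INCL is only applied to inclusions listed directly in the external context, without transitivity.

```rust
use std::collections::HashMap;

/// Toy lifetime contexts; lifetimes are identified by name.
struct LftCtx {
    /// External context E: each pair (k, k2) stands for "k is included in k2".
    external: Vec<(&'static str, &'static str)>,
    /// Local context L: each local lifetime with its list of superlifetimes.
    local: HashMap<&'static str, Vec<&'static str>>,
}

impl LftCtx {
    /// Searches for a derivation that `k` is alive.
    fn alive(&self, k: &'static str) -> bool {
        self.alive_on_path(k, &mut Vec::new())
    }

    fn alive_on_path(&self, k: &'static str, path: &mut Vec<&'static str>) -> bool {
        if path.contains(&k) {
            return false; // cyclic search: no finite derivation this way
        }
        path.push(k);
        // LALIVE-LOCAL: a local lifetime is alive if all superlifetimes are.
        let by_local = match self.local.get(k) {
            Some(supers) => supers.iter().all(|s| self.alive_on_path(*s, path)),
            None => false,
        };
        // LALIVE-INCL, restricted to direct entries of E: if some alive
        // lifetime is included in `k`, then `k` is alive as well.
        let result = by_local
            || self
                .external
                .iter()
                .any(|(sub, sup)| *sup == k && self.alive_on_path(*sub, path));
        path.pop();
        result
    }
}

/// Contexts of the walkthrough: E lists "f included in a"; L lists "f" with
/// no superlifetimes, unless the function lifetime has been ended.
fn example_ctx(function_lifetime_open: bool) -> LftCtx {
    let mut local = HashMap::new();
    if function_lifetime_open {
        local.insert("f", Vec::new());
    }
    LftCtx { external: vec![("f", "a")], local }
}
```

With the function lifetime still in the local context, both "f" and "a" are found to be alive; once it is removed, the derivation for "a" no longer goes through.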
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The local lifetime context also appears in the types of continuations: Both of our continuation 
expect the local lifetime context at their call site to be ¢ € []. In other words, ¢ has to be still alive 
when the continuation is invoked. In particular, this means that option_as_mut cannot end ¢, or 
else it would not be able to call its return continuation. 

Having discharged the first premise of F-casr-Bor, let us now come to the second premise: 
showing that all the branches of the case distinction are well-typed. The case distinction operates 
on a pointer to () + int, so in the branches, we can assume that y.1 (the data stored in the sum) is a 
pointer to () or int, respectively. The second case is the more interesting one, where we go on with 


the following typing context: 
X down 41,6 down $2, y.1 < &i,, int 


The next instruction is r === y.1, which is type-checked using S-sum-asscn. Again the main work 
of adjusting the types is offloaded to a helper judgment: I | E;L + 7; —o* tT) means that we can 
write a value of type t to a pointer of type 7, changing the type of the pointer to 7». In this case, 
we derive [ | E;L + own $5 —o9t&mut Mt own (() + och ut int) using Twrire-own. This is a strong 
update, which changes the type of r from uninitialized to the return type of our example function. 


Our context thus becomes: 
X down 41, r 4 own (() + Gut int) 


Notice that y.1 disappeared from the context; it was used up when we moved it into r. 

Finally, we jump to the continuation k that we declared earlier. This is type-checked using F-jump, 
which verifies that our current typing context T and local lifetime context L match what is expected 
by the continuation. 
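
For orientation, the function type-checked above can be read as the λRust counterpart of a small surface-Rust function. The following is a minimal sketch under that assumption: the payload type i32 stands in for λRust's int, and the variable name `payload` is ours. The two match arms correspond to the two branches of the case distinction.

```rust
/// Turns a mutable reference to an optional integer into an optional
/// mutable reference to the integer stored inside it.
pub fn option_as_mut<'a>(x: &'a mut Option<i32>) -> Option<&'a mut i32> {
    match x {
        // First branch of the case distinction: there is no payload to borrow.
        None => None,
        // Second branch: the reference to the payload (`y.1` in the walkthrough)
        // is moved into the result and is not available afterwards.
        Some(payload) => Some(payload),
    }
}
```

The returned reference carries the same lifetime 'a as the argument, which mirrors the requirement that the function cannot end that lifetime before returning.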

Further noteworthy type system features. Besides the type assignments we have already seen, the
type context can also contain lifetime-blocked type assignments $p \lhd^{\dagger\kappa} \tau$. Such assignments are
introduced when creating a reference (C-borrow, C-reborrow), which blocks the referent until the
lifetime of the reference ends (F-endlft), as expressed by the unblocking judgment $\mathbf{T} \Rightarrow^{\dagger\kappa} \mathbf{T}'$.

External lifetime context satisfaction $\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \mathbf{E}'$ is used on function calls to check the
assumptions made by the callee (F-call). The $\lhd$ in F-call indicates that we are requiring a list of
type assignments in the context, matching a list of variables ($\bar{x}$) with an equal-length list of types
($\mathbf{own}\ \bar{\tau}$).

Subtyping is described by $\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \tau_1 \Rightarrow \tau_2$. The main forms of subtyping supported in Rust
are lifetime inclusion (T-bor-lft) and (un)folding recursive types. Apart from that, there are the
usual structural rules witnessing covariance and contravariance of type constructors. On the type
context level, $\Gamma \mid \mathbf{E}; \mathbf{L} \vdash \mathbf{T}_1 \overset{\mathsf{ctx}}{\Rightarrow} \mathbf{T}_2$ lifts subtyping (C-subtype) while also adding a few coercions that
can only be applied at the top-level type. Most notably, a mutable reference can be coerced into
a shared reference (C-share), an owned pointer can be borrowed (C-borrow) to create a mutable
reference, and a mutable reference can be reborrowed (C-reborrow).⁶
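
The three coercions have direct surface-Rust counterparts. The sketch below is only an illustration at the level of Rust source, not of λRust; the helper names `read`, `bump` and `coercions_demo` are hypothetical.

```rust
/// Reads through a shared reference.
fn read(r: &i32) -> i32 {
    *r
}

/// Increments through a mutable reference.
fn bump(r: &mut i32) {
    *r += 1;
}

pub fn coercions_demo() -> i32 {
    let mut owned: Box<i32> = Box::new(1); // an owned pointer
    let m: &mut i32 = &mut *owned; // borrow the owned pointer (cf. C-borrow)
    {
        let shorter: &mut i32 = &mut *m; // reborrow for a shorter lifetime (cf. C-reborrow)
        bump(shorter);
    }
    bump(m); // the original mutable reference is usable again
    let s: &i32 = m; // mutable reference coerced to a shared one (cf. C-share)
    let seen = read(s) + read(s); // shared references can be used repeatedly
    *owned += seen; // once all borrows have ended, the owner has full access again
    *owned
}
```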


4 RUSTBELT: A SEMANTIC MODEL OF λRust TYPES IN IRIS


Our proof of soundness of λRust proceeds by defining a logical relation, which interprets the types and
typing judgments of λRust as logical predicates in an appropriate semantic domain. We focus here
on the interpretation of types, leaving the interpretation of typing judgments and the statements of
our main results to §7. First, in §4.1, we give a simplified version of the semantic domain of types.
In §4.2, we give the semantic interpretation of some representative λRust types. Finally, in §4.3, we
focus on the interpretation of shared reference types. It will turn out that we have to generalize
our semantic domain of types to account for them.


⁶There is no need to reborrow shared references because they are duplicable, and hence using subtyping to a shorter lifetime
does not lose any information.

4.1 A Simplified Semantic Domain of Types


The semantic domain of types answers the question “What is a type?”. Usually, the answer is that a
type denotes a set of values or, equivalently, a predicate over values. Fundamentally, this is also the
case for λRust, but the details get somewhat more complicated. First of all, our model of the type
system of λRust expresses types not as predicates in “plain mathematics” (e.g., the usual higher-order
logic), but as predicates in Iris. As discussed in the introduction, Iris is a higher-order separation
logic designed to prove correctness of complex concurrent programs. Using Iris to express types
has the advantage that concepts like ownership are already built into the underlying framework,
so the model itself does not have to take care of them.

Rather than try to explain all the features of Iris here, we will introduce them en passant, as
needed. However, one that is worth mentioning up front is the ability to define predicates by
guarded recursion. This means that a predicate can refer to itself recursively, but only below a $\triangleright$
(“later”) modality [Appel et al. 2007] or some other appropriate “guard”. The use of a guard ensures
that the circular definition can be solved (regardless of whether the recursive reference occurs
positively, negatively, or both) using the technique of “step-indexing” [Appel and McAllester 2001].
For this reason, $\triangleright$ appears in various places in our model; the placement of these $\triangleright$'s is important
for soundness, but is not otherwise relevant to our high-level exposition, so we will mostly ignore
it in the rest of the paper.

Our interpretation of types associates to every type $\tau$ an Iris predicate $\llbracket\tau\rrbracket.\textsc{own} \in \mathit{TId} \times \mathit{list}(\mathit{Val}) \to \mathit{iProp}$.
This predicate takes two parameters and returns an Iris proposition (of type iProp). The
second parameter is the list of values we are considering. It turns out that types in Rust do not just
cover a single value: in general, data is laid out in memory and spans multiple locations. However,
we have to impose some restrictions on the lists of values accepted by a type: we require that every
type has a fixed size $\llbracket\tau\rrbracket.\textsc{size}$. This size is used to compute the layout of compound data structures,
e.g., for product types. We require that a type only accepts lists whose length matches the size:

$$\llbracket\tau\rrbracket.\textsc{own}(t, \bar{v}) \Rightarrow |\bar{v}| = \llbracket\tau\rrbracket.\textsc{size} \qquad \text{(Ty-size)}$$


Furthermore, for Copy types we require that $\llbracket\tau\rrbracket.\textsc{own}(t, \bar{v})$ be persistent. In Iris, a proposition is
considered persistent if it does not describe ownership of any exclusive right or resource, and can
therefore be freely copied and shared among several parties.

The first parameter of the predicate (of type TId) permits types to moreover depend on the thread
identifier of the thread that claims ownership. This is used for types like &Cell that cannot be
sent to another thread. In other words, ownership is (in general) thread-relative. As we explained in
§1.2, this provides a very natural way of modeling Send: semantically speaking, a type $\tau$ is Send if
$\llbracket\tau\rrbracket.\textsc{own}$ does not depend on the thread id. We will see more details about this in §6.1, when we
give the interpretation of Cell, a type that cannot be shared across threads.


4.2 Interpreting Types 


Now that we have a semantic domain of types, we can define their semantic interpretation as a
function from syntactic types $\tau$ into the semantic domain. In this paper, we focus on the most
representative types. The full interpretation can be found in the technical appendix [Jung et al.
2017a].


Booleans. To get started, let us consider a very simple type: bool. It should not come as a surprise
that $\llbracket\mathbf{bool}\rrbracket.\textsc{size} := 1$. The semantic predicate of a Boolean is defined as follows:

$$\llbracket\mathbf{bool}\rrbracket.\textsc{own}(t, \bar{v}) := \bar{v} = [\mathsf{true}] \lor \bar{v} = [\mathsf{false}]$$

In other words, a Boolean can only be a singleton list (which is already expressed by its size), and
that list has to contain either true or false.
Unsurprisingly, the semantic interpretation of integers is similar and equally straightforward.


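Products. Given two types $\tau_1$ and $\tau_2$, we define the semantics of their binary product $\tau_1 \times \tau_2$ as
that of the two types laid out one after the other in memory. This definition can be iterated to yield
the interpretation of n-ary products.

For the size, we have $\llbracket\tau_1 \times \tau_2\rrbracket.\textsc{size} := \llbracket\tau_1\rrbracket.\textsc{size} + \llbracket\tau_2\rrbracket.\textsc{size}$. The semantic predicate associated
with $\tau_1 \times \tau_2$ uses separating conjunction ($P \ast Q$), the defining feature of separation logic, to join the
semantic predicates of both types. The separating conjunction ensures that they describe ownership
of disjoint pieces of memory. (Here, $\mathbin{+\!\!+}$ is list concatenation.)

$$\llbracket\tau_1 \times \tau_2\rrbracket.\textsc{own}(t, \bar{v}) := \exists \bar{v}_1, \bar{v}_2.\ \bar{v} = \bar{v}_1 \mathbin{+\!\!+} \bar{v}_2 \ast \llbracket\tau_1\rrbracket.\textsc{own}(t, \bar{v}_1) \ast \llbracket\tau_2\rrbracket.\textsc{own}(t, \bar{v}_2)$$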
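
To make the shape of these definitions concrete, here is a deliberately crude executable sketch. It assumes that Iris propositions can be approximated by plain Booleans, which discards ownership and separation entirely, so it only illustrates how sizes add up and how the value list is split. The names `Val`, `SemType`, `bool_ty`, `int_ty` and `product` are ours and do not come from the Coq development.

```rust
/// A value stored in a single memory location (toy stand-in for λRust values).
#[derive(Clone, Debug, PartialEq)]
pub enum Val {
    Bool(bool),
    Int(i64),
}

/// Toy semantic type: a fixed size plus a predicate over a thread id and a
/// list of values. `bool` stands in for iProp, so ownership is NOT modelled.
pub struct SemType {
    pub size: usize,
    pub own: Box<dyn Fn(u32, &[Val]) -> bool>,
}

/// A Boolean is a singleton list containing a Boolean value.
pub fn bool_ty() -> SemType {
    SemType {
        size: 1,
        own: Box::new(|_tid: u32, vs: &[Val]| matches!(vs, [Val::Bool(_)])),
    }
}

/// Integers are interpreted analogously.
pub fn int_ty() -> SemType {
    SemType {
        size: 1,
        own: Box::new(|_tid: u32, vs: &[Val]| matches!(vs, [Val::Int(_)])),
    }
}

/// Binary product: sizes add up, and the list is split into two consecutive parts.
pub fn product(a: SemType, b: SemType) -> SemType {
    let (size_a, size_b) = (a.size, b.size);
    SemType {
        size: size_a + size_b,
        own: Box::new(move |tid: u32, vs: &[Val]| {
            // Because each component accepts only lists of its own size,
            // the only split worth trying is at index `size_a`.
            vs.len() == size_a + size_b
                && (a.own)(tid, &vs[..size_a])
                && (b.own)(tid, &vs[size_a..])
        }),
    }
}
```

In this approximation the existential over the two sublists collapses to a single split point, which is one way to see why the fixed-size requirement makes layouts computable.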


Owned pointers. In order to give a semantic interpretation to the type $\mathbf{own}_n\,\tau$ of owned pointers,
we use the standard points-to proposition of separation logic, $\ell \mapsto \bar{v}$. It states that, starting at
location $\ell$, the memory contains the values $\bar{v}$, and asserts ownership of this memory region. With this
ingredient, the interpretation is given by $\llbracket\mathbf{own}_n\,\tau\rrbracket.\textsc{size} := 1$ and the following semantic predicate:

$$\llbracket\mathbf{own}_n\,\tau\rrbracket.\textsc{own}(t, \bar{v}) := \exists \ell.\ \bar{v} = [\ell] \ast \exists \bar{w}.\ \ell \mapsto \bar{w} \ast \triangleright \llbracket\tau\rrbracket.\textsc{own}(t, \bar{w}) \ast \triangleright \mathsf{DeallocSize}(\ell, n, \llbracket\tau\rrbracket.\textsc{size})$$

Rust supports recursive types whenever the recursive occurrence is below a pointer indirection.
To properly model this using Iris’s guarded recursive definitions, we have to make sure that all
uses of $\tau$ are guarded, in this case by adding a $\triangleright$.

The proposition $\mathsf{DeallocSize}(\ell, n, \llbracket\tau\rrbracket.\textsc{size})$ in the semantic predicate above manages the right to
deallocate the location $\ell$. These details are spelled out in our technical appendix [Jung
et al. 2017a].


Mutable references. Mutable references, like owned pointers, are unique pointers to something
of type $\tau$. The key difference is that mutable references are borrowed, not owned, and hence they
come with a lifetime indicating when they expire. In standard separation logic, an assertion always
represents ownership of some part of the heap, for an unlimited duration (or until the owner actively
decides to give it to another party). Instead, a mutable reference in Rust represents ownership for
a limited period of time. When the lifetime of the reference is over, a mutable reference becomes
useless, because the original owner gets back the full ownership.

To handle this new notion of “ownership with an expiry date”, we developed a custom logic
for reasoning about lifetimes and borrowing. It is called the lifetime logic. This logic is embedded
and proven correct in Iris, and we describe it in §5. Most importantly, for an Iris assertion $P$ and a
lifetime $\kappa$, the lifetime logic defines an assertion $\&^{\kappa}_{\mathsf{full}}\, P$, called a full borrow, representing ownership
of $P$ for the duration of lifetime $\kappa$. Using full borrows, the interpretation of the type of mutable
references is as follows:

$$\llbracket\&^{\kappa}_{\mathsf{mut}}\,\tau\rrbracket.\textsc{size} := 1 \qquad \llbracket\&^{\kappa}_{\mathsf{mut}}\,\tau\rrbracket.\textsc{own}(t, \bar{v}) := \exists \ell.\ \bar{v} = [\ell] \ast \&^{\kappa}_{\mathsf{full}} \big(\exists \bar{w}.\ \ell \mapsto \bar{w} \ast \llbracket\tau\rrbracket.\textsc{own}(t, \bar{w})\big)$$

This is very similar to the interpretation of $\mathbf{own}_n\,\tau$, except that the assertion describing ownership
of the contents of the reference ($\exists \bar{w}.\ \ell \mapsto \bar{w} \ast \llbracket\tau\rrbracket.\textsc{own}(t, \bar{w})$) is wrapped in a full borrow (at lifetime
$\kappa$) instead of being owned directly. Finally, it turns out that $\&^{\kappa}_{\mathsf{full}}\, P$ already functions as a guard of
$P$, so there is no need for us to add any extra later modality $\triangleright$.


4.3 Interpreting Shared References


The interpretation of shared references $\&^{\kappa}_{\mathsf{shr}}\,\tau$ requires more work than the types we considered so
far. Usually, we would proceed as we did above: define $\llbracket\&^{\kappa}_{\mathsf{shr}}\,\tau\rrbracket.\textsc{own}$ based on $\llbracket\tau\rrbracket.\textsc{own}$ such that
all the typing rules for $\&^{\kappa}_{\mathsf{shr}}\,\tau$ work out. Most of the time, this does not leave much room for choice;
the primitive operations available for the type almost define it uniquely. This is decidedly not the
case for shared references, for it turns out that, in Rust, there are hardly any primitive operations
on &T. The only properties that hold for &T in general are that it can be created from a &mut T, it is
Copy, it has size 1, and its values have to be memory locations. However, as we have seen in §2,
types like Cell or Mutex do provide some very interesting operations on shared references, e.g.,
providing indirect mutable access through a shared reference.

To account for this freedom, we permit every type to pick its own sharing predicate. We then use
the sharing predicate of $\tau$ to define $\llbracket\&^{\kappa}_{\mathsf{shr}}\,\tau\rrbracket.\textsc{own}$. This permits, for every type, a different set of
operations on its shared references. For example, the sharing predicate for basic types like bool
allows read-only access, while the sharing predicate for Mutex<T> allows read and write accesses
to the underlying object of type T once the lock has been acquired.
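
At the level of Rust source, the difference between these two sharing predicates shows up as follows. This is a small illustrative sketch (the function names are ours): a shared reference to an integer only permits reads, whereas a shared reference to a Mutex permits writes once the lock is held.

```rust
use std::sync::Mutex;

/// A shared reference to a plain integer only supports reading.
pub fn read_only(shared: &i32) -> i32 {
    // `*shared += 1;` would be rejected by the compiler here.
    *shared
}

/// A shared reference to a Mutex supports writing after acquiring the lock.
pub fn write_through_mutex(shared: &Mutex<i32>) -> i32 {
    let mut guard = shared.lock().unwrap();
    *guard += 1;
    *guard
}
```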

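More formally, we extend the semantic domain of types and associate to each of them another
predicate $\llbracket\tau\rrbracket.\textsc{shr} \in \mathit{Lft} \times \mathit{TId} \times \mathit{Loc} \to \mathit{iProp}$, and use it directly to model shared references:

$$\llbracket\&^{\kappa}_{\mathsf{shr}}\,\tau\rrbracket.\textsc{size} := 1 \qquad \llbracket\&^{\kappa}_{\mathsf{shr}}\,\tau\rrbracket.\textsc{own}(t, \bar{v}) := \exists \ell.\ \bar{v} = [\ell] \ast \llbracket\tau\rrbracket.\textsc{shr}(\llbracket\kappa\rrbracket, t, \ell)$$

The $\llbracket\tau\rrbracket.\textsc{shr}$ predicate takes three parameters: the lifetime $\kappa$ of the shared reference, the thread
identifier $t$, and the location $\ell$ constituting the shared reference itself. Just like Send expresses
that $\llbracket\tau\rrbracket.\textsc{own}$ does not actually depend on the thread identifier (see §4.1), we define Sync to mean
that $\llbracket\tau\rrbracket.\textsc{shr}$ does not depend on the thread identifier. To support the aforementioned primitive
operations on &T, the sharing predicate has to satisfy the following properties:

$$\begin{array}{ll}
\mathsf{persistent}(\llbracket\tau\rrbracket.\textsc{shr}(\kappa, t, \ell)) & \text{(Ty-shr-persist)} \\
\&^{\kappa}_{\mathsf{full}} \big(\exists \bar{w}.\ \ell \mapsto \bar{w} \ast \llbracket\tau\rrbracket.\textsc{own}(t, \bar{w})\big) \ast [\kappa]_q \mathrel{\equiv\!\!\ast} \llbracket\tau\rrbracket.\textsc{shr}(\kappa, t, \ell) \ast [\kappa]_q & \text{(Ty-share)} \\
\kappa' \sqsubseteq \kappa \ast \llbracket\tau\rrbracket.\textsc{shr}(\kappa, t, \ell) \Rightarrow \llbracket\tau\rrbracket.\textsc{shr}(\kappa', t, \ell) & \text{(Ty-shr-mono)}
\end{array}$$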


First, Ty-shr-persist requires that $\llbracket\tau\rrbracket.\textsc{shr}$ be persistent, which implies that $\llbracket\&^{\kappa}_{\mathsf{shr}}\,\tau\rrbracket.\textsc{own}(t, \bar{v})$ is
persistent. This corresponds to the fact that, in Rust, shared references are always Copy.

Second, Ty-share asserts that shared references can be created from mutable references: this is the
main ingredient for proving the rule C-share of the type system. Looking at this rule more closely,
its first premise is a full borrow of an owned pointer to $\tau$. This is exactly $\llbracket\&^{\kappa}_{\mathsf{mut}}\,\tau\rrbracket.\textsc{own}(t, [\ell])$. Its
second premise is a lifetime token $[\kappa]_q$ which, as we will explain in §5, witnesses that the lifetime
is alive and permits accessing borrows. Given these premises, Ty-share states that we can perform
an update, denoted by the Iris connective $\mathrel{\equiv\!\!\ast}$.⁷ This update will safely transform the resources
described by the premises into those described by the conclusion, namely $\tau$’s sharing predicate
along with the same lifetime token that was passed in.

Third, Ty-shr-mono requires that $\llbracket\tau\rrbracket.\textsc{shr}$ be monotone with respect to the lifetime parameter.
This is important for proving the subtyping rule T-bor-lft.
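
Two of these requirements can be observed directly in surface Rust. The following sketch is only an illustration with hypothetical function names: `copy_shared` relies on shared references being Copy (the counterpart of persistence), and `shorten` relies on a shared reference at a longer lifetime being usable at a shorter one (the counterpart of monotonicity).

```rust
/// A shared reference valid for 'long can be used where one valid for the
/// shorter 'short is expected; the bound 'long: 'short states the inclusion.
pub fn shorten<'long: 'short, 'short>(r: &'long i32) -> &'short i32 {
    r
}

/// Shared references are Copy: `r` stays usable after being copied twice.
pub fn copy_shared(r: &i32) -> (i32, i32) {
    let a = r;
    let b = r;
    (*a, *b + *r)
}
```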

The addition of the sharing predicate completes our description of the semantic domain of
types: each type $\tau$ is interpreted by a tuple $\llbracket\tau\rrbracket = (\textsc{size}, \textsc{own}, \textsc{shr})$ of a natural number and two Iris
predicates that satisfy Ty-size, Ty-shr-persist, Ty-share and Ty-shr-mono. Let us now go back to the
types we already considered above and define their sharing predicates.


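Sharing predicate for products. The sharing predicate for products is simply the separating
conjunction of the sharing predicates of the two components:

$$\llbracket\tau_1 \times \tau_2\rrbracket.\textsc{shr}(\kappa, t, \ell) := \llbracket\tau_1\rrbracket.\textsc{shr}(\kappa, t, \ell) \ast \llbracket\tau_2\rrbracket.\textsc{shr}(\kappa, t, \ell + \llbracket\tau_1\rrbracket.\textsc{size})$$

The location used for the second component is shifted by $\llbracket\tau_1\rrbracket.\textsc{size}$, reflecting the memory layout.


⁷The connective $P \mathrel{\equiv\!\!\ast} Q$ is in fact a shorthand for $P \mathrel{-\!\!\ast} {\mid\!\Rrightarrow}\, Q$ in Iris [Jung et al. 2017b].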

Sharing predicate for simple types. It turns out that there is a common pattern for defining the
sharing predicates of many basic types: when no interior mutability is at play, a shared
reference provides read-only access. The resulting sharing predicate can be used for any Copy type $\tau$ of
size 1. In this case, $\llbracket\tau\rrbracket.\textsc{own}(t, \bar{v})$ can be written in the following form:

$$\llbracket\tau\rrbracket.\textsc{own}(t, \bar{v}) = \exists v.\ \bar{v} = [v] \ast \Phi_\tau(t, v)$$

where $\Phi_\tau$ is a persistent predicate. This is the case, for example, for bool, int, function types, and
shared references $\&^{\kappa}_{\mathsf{shr}}\,\tau$ themselves. For these types, we use the following sharing predicate:

$$\llbracket\tau\rrbracket.\textsc{shr}(\kappa, t, \ell) := \exists v.\ \&^{\kappa}_{\mathsf{frac}} \big(\lambda q.\ \ell \mapsto^{q} v\big) \ast \triangleright \Phi_\tau(t, v)$$

This definition says that there exists a fixed value $v$ (the current value the reference points to)
such that $\Phi_\tau$ holds under the later modality $\triangleright$ (recall that shared references are pointers, and hence
occurrences of $\tau$ need to be guarded to enable construction of recursive types), and that we have a
fractured borrow $\&^{\kappa}_{\mathsf{frac}} (\lambda q.\ \ell \mapsto^{q} v)$ of the ownership of the memory cell.

Fractured borrows are another notion provided by the lifetime logic: similarly to full borrows,
they represent temporary ownership of some resource, limited by a given lifetime. The difference
is that they are persistent, but only grant some fraction of the content. Fortunately, that is all that is
needed in order to support a read of the shared reference.


5 LIFETIME LOGIC 


In §4, we gave a semantic model for λRust types, but we left some important notions undefined. In
particular, we used the notion of a full borrow $\&^{\kappa}_{\mathsf{full}}\, P$ in the interpretation of mutable references
to reflect that this kind of ownership is temporary and will “expire” when lifetime $\kappa$ ends; we
mentioned lifetime tokens $[\kappa]_q$ as a resource used to witness that a lifetime is ongoing; and we
employed fractured borrows $\&^{\kappa}_{\mathsf{frac}}\, P$ in the sharing predicate of simple types.

In this section, we describe the lifetime logic, a library we have developed in Iris to support these 
notions. In the paper, we focus on discussing the proof rules provided by the library and show 
how the lifetime logic can be used to model temporary and potentially shared ownership of Iris 
resources. More details can be found in our technical appendix and in our Coq development [Jung 
et al. 2017a]. 

We start by presenting the two core notions of lifetimes and full borrows in §5.1. We then continue 
in §5.2, explaining how lifetimes can be compared and intersected. Finally, in §5.3, we present 
fractured borrows, which we have already seen as being useful for defining sharing predicates. 


5.1 Full Borrows and Lifetime Tokens


Figure 4 shows the main rules of the lifetime logic. We explain them by referring to the following
Rust example, similar to the one in §2.4:


1  let mut v = Vec::new(); v.push(0);
2  { let mut head = v.index_mut(0); *head = 23; }
3  println!("{:?}", v);


Recall the type of index_mut: for<'a> fn(&'a mut Vec<i32>, usize) -> &'a mut i32. To
call this function, we need a borrow at some lifetime $\kappa$ (which we will use to instantiate 'a). To
get started, we need to create this lifetime. This is the role of LftL-begin: it lets us perform an
Iris update to create a fresh lifetime $\kappa$ and gives us the full lifetime token $[\kappa]_1$ witnessing that this
lifetime is ongoing. (This token can then be split into fractional lifetime tokens $[\kappa]_q$; see below.) It
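
$$\begin{array}{ll}
\text{LftL-begin} & \mathsf{True} \mathrel{\equiv\!\!\ast} \exists \kappa.\ [\kappa]_1 \ast \big([\kappa]_1 \mathrel{\equiv\!\!\triangleright\!\!\ast} [\dagger\kappa]\big) \\
\text{LftL-tok-fract} & [\kappa]_{q+q'} \Leftrightarrow [\kappa]_q \ast [\kappa]_{q'} \\
\text{LftL-not-own-end} & [\kappa]_q \ast [\dagger\kappa] \Rightarrow \mathsf{False} \\
\text{LftL-end-persist} & \mathsf{persistent}([\dagger\kappa]) \\
\text{LftL-borrow} & \triangleright P \mathrel{\equiv\!\!\ast} \&^{\kappa}_{\mathsf{full}}\, P \ast \big([\dagger\kappa] \mathrel{\equiv\!\!\ast} \triangleright P\big) \\
\text{LftL-bor-split} & \&^{\kappa}_{\mathsf{full}} (P \ast Q) \mathrel{\equiv\!\!\ast} \&^{\kappa}_{\mathsf{full}}\, P \ast \&^{\kappa}_{\mathsf{full}}\, Q \\
\text{LftL-bor-acc} & \&^{\kappa}_{\mathsf{full}}\, P \ast [\kappa]_q \mathrel{\equiv\!\!\ast} \triangleright P \ast \big(\triangleright P \mathrel{\equiv\!\!\ast} \&^{\kappa}_{\mathsf{full}}\, P \ast [\kappa]_q\big) \\
\text{LftL-bor-shorten} & \kappa' \sqsubseteq \kappa \ast \&^{\kappa}_{\mathsf{full}}\, P \Rightarrow \&^{\kappa'}_{\mathsf{full}}\, P \\
\text{LftL-incl-isect} & \kappa \sqcap \kappa' \sqsubseteq \kappa \\
\text{LftL-incl-glb} & \kappa \sqsubseteq \kappa' \ast \kappa \sqsubseteq \kappa'' \Rightarrow \kappa \sqsubseteq \kappa' \sqcap \kappa'' \\
\text{LftL-tok-inter} & [\kappa \sqcap \kappa']_q \Leftrightarrow [\kappa]_q \ast [\kappa']_q \\
\text{LftL-end-inter} & [\dagger\, \kappa \sqcap \kappa'] \Leftrightarrow [\dagger\kappa] \lor [\dagger\kappa'] \\
\text{LftL-tok-unit} & \mathsf{True} \Rightarrow [\varepsilon]_q \\
\text{LftL-end-unit} & [\dagger\varepsilon] \Rightarrow \mathsf{False} \\
\text{LftL-reborrow} & \kappa' \sqsubseteq \kappa \ast \&^{\kappa}_{\mathsf{full}}\, P \mathrel{\equiv\!\!\ast} \&^{\kappa'}_{\mathsf{full}}\, P \ast \big([\dagger\kappa'] \mathrel{\equiv\!\!\ast} \&^{\kappa}_{\mathsf{full}}\, P\big)
\end{array}$$

Fig. 4. Selected rules of the lifetime logic.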


also provides the update $[\kappa]_1 \mathrel{\equiv\!\!\triangleright\!\!\ast} [\dagger\kappa]$: we will use this update later to end $\kappa$ by exchanging the
full lifetime token $[\kappa]_1$ for a dead token, written $[\dagger\kappa]$, indicating that $\kappa$ has ended.⁸

Once the lifetime has been created, we can borrow the vector v at the lifetime $\kappa$ in order to pass
a borrowed reference to index_mut. This is allowed by LftL-borrow, really the core rule of the
lifetime logic. This rule splits ownership of a resource $P$ (in our example, the vector v) into the
separating conjunction of a full borrow $\&^{\kappa}_{\mathsf{full}}\, P$ and an inheritance $[\dagger\kappa] \mathrel{\equiv\!\!\ast} \triangleright P$. The borrow grants
access to $P$ during the lifetime $\kappa$, while the inheritance allows us to retrieve ownership of $P$ after
$\kappa$ has ended. In other words, LftL-borrow splits ownership in time. The separating conjunction
indicates that the two operands are “disjoint”, which means we can safely transfer ownership of the
borrow to index_mut and keep ownership of the inheritance for ourselves to use later. Here, however,
this is not disjointness in space (e.g., in the memory), since both the borrow and the inheritance
grant access to the same shared resource. Rather, it is disjointness in time: the lifetime $\kappa$ is either
ongoing or ended, so the borrow and the inheritance are never useful at the same time.

We do not give the actual implementation of index_mut in this paper. However, here is what
index_mut does with respect to ownership. First, the ownership of the memory used by the vector
(“inside” the full borrow) is split into two parts: (1) the ownership of the accessed vector position,
and (2) the ownership of the rest of the vector. Then, the rule LftL-bor-split is used to split the full
borrow into two full borrows dedicated to each of these parts. The full borrow of part (1) is returned
to the caller; this matches the return type of index_mut. On the other hand, the full borrow of part
(2) is dropped.⁹ This means that the ownership of the rest of the vector is effectively lost until the
lifetime ends, at which point it can be recovered using the inheritance.

The next step of our program is the write to head on line 2. Recall that the type of head is
&mut i32, which represents ownership of a full borrow of a single memory location. In order to
perform this write, we need to access this full borrow and get the resource it contains (in particular,
the maps-to predicate $\ell \mapsto v$). This is what LftL-bor-acc does: if we give it a full borrow $\&^{\kappa}_{\mathsf{full}}\, P$
and a lifetime token $[\kappa]_q$ witnessing that $\kappa$ is alive, then we get the resource $P$. Moreover, we also


⁸Note that the ending update uses an “update that takes a step” $\mathrel{\equiv\!\!\triangleright\!\!\ast}$ rather than a normal update $\mathrel{\equiv\!\!\ast}$. This connective,
which is defined in the appendix [Jung et al. 2017a] and is required for technical reasons related to step-indexing, restricts
the update to only be used in conjunction with reasoning about a physical step of computation.

⁹Iris is an affine logic, in which it is possible to give up ownership of resources at any time, i.e., Iris has the law $P \ast Q \vdash P$.
get the update $\triangleright P \mathrel{\equiv\!\!\ast} \&^{\kappa}_{\mathsf{full}}\, P \ast [\kappa]_q$: this can later be used when we are done with $P$, in order to
reconstitute the full borrow and get back the lifetime token. In our proofs, we will always be forced
to give back all the lifetime tokens that we obtained; this makes sure that we properly close all
borrows again. This can be seen, for example, in Ty-share: a token is provided as a premise to this
update, but the same token must also be returned again in the conclusion.

Finally, at the end of line 2 of our example, head goes out of scope and it is time to end $\kappa$. To
this end, we apply the update $[\kappa]_1 \mathrel{\equiv\!\!\triangleright\!\!\ast} [\dagger\kappa]$ that we obtained when $\kappa$ was created. In doing so, we
have to give up the lifetime token $[\kappa]_1$ (ensuring that all borrows are closed again), but we get back
the dead token $[\dagger\kappa]$, which can be used to prove that $\kappa$ has indeed ended. Now that $\kappa$ has ended,
we can use our inheritance $[\dagger\kappa] \mathrel{\equiv\!\!\ast} \triangleright P$ to get back the ownership of v before printing it. Note
that the dead token $[\dagger\kappa]$ is persistent (LftL-end-persist), so it can be used multiple times; this is
important since there may be many borrows (and thus many inheritances we wish to use) at the
same lifetime. Each inheritance, however, may only be used once.
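
The walkthrough above can be summarized by annotating a runnable variant of the example program. The comments record which lifetime-logic rule the preceding discussion associates with each step; the function name `walkthrough` and the choice to return the vector instead of printing it are ours.

```rust
use std::ops::IndexMut;

pub fn walkthrough() -> Vec<i32> {
    // Line 1 of the example: create the vector and push one element.
    let mut v = Vec::new();
    v.push(0);
    {
        // LftL-begin creates the lifetime of this borrow; LftL-borrow splits
        // ownership of `v` into a full borrow (passed to index_mut) and an
        // inheritance (kept by the caller).
        let head: &mut i32 = v.index_mut(0);
        // LftL-bor-acc opens the full borrow so that the write can happen,
        // and closes it again afterwards.
        *head = 23;
    }
    // The lifetime has ended here: the inheritance gives back ownership of
    // `v`, so it can be used (printed in the original example) again.
    v
}
```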

One important feature of the lifetime logic that this example does not demonstrate is the
parameter $q$, a fraction. Lifetime tokens can always be split into smaller parts, in a reversible fashion
(LftL-tok-fract). This is needed when we want to access several full borrows with the same lifetime
at the same time, or to witness that a lifetime is ongoing in several threads simultaneously. Moreover,
unsurprisingly, a lifetime cannot be both dead and alive at the same time (LftL-not-own-end).
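
The bookkeeping behind splitting and re-joining tokens can be mimicked by a small toy model. This is only a sketch under strong assumptions: a token is a plain Rust value carrying a lifetime identifier and a rational fraction, and the type `Token` with its methods is hypothetical. Since `Token` is neither Clone nor Copy, a token cannot be duplicated, only split.

```rust
/// Toy lifetime token: lifetime id `lft` and fraction `num / den` in (0, 1].
#[derive(Debug, PartialEq)]
pub struct Token {
    pub lft: u32,
    pub num: u64,
    pub den: u64,
}

fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 { a } else { gcd(b, a % b) }
}

impl Token {
    /// The full token, as handed out when a lifetime is created.
    pub fn full(lft: u32) -> Token {
        Token { lft, num: 1, den: 1 }
    }

    /// Splitting direction of LftL-tok-fract: fraction q becomes q/2 and q/2.
    pub fn split(self) -> (Token, Token) {
        let first = Token { lft: self.lft, num: self.num, den: self.den * 2 };
        let second = Token { lft: first.lft, num: first.num, den: first.den };
        (first, second)
    }

    /// Joining direction: fractions of the same lifetime add up. Returns None
    /// for different lifetimes or if the sum would exceed the full token.
    pub fn join(self, other: Token) -> Option<Token> {
        if self.lft != other.lft {
            return None;
        }
        let num = self.num * other.den + other.num * self.den;
        let den = self.den * other.den;
        if num > den {
            return None;
        }
        let g = gcd(num, den);
        Some(Token { lft: self.lft, num: num / g, den: den / g })
    }
}
```

Splitting a full token twice and joining the three pieces again yields the full token, which is the reversibility that the rule expresses.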


5.2 Lifetime Inclusion 


In §2 and §3, we have seen that Rust relates lifetimes by lifetime inclusion. This is used for subtyping
(T-bor-lft) and reborrowing (C-reborrow).

What does it mean for a lifetime $\kappa$ to be “included” in another $\kappa'$? The key property of lifetime
inclusion is that when the shorter $\kappa$ is still alive, then so is the longer $\kappa'$. From the perspective of
lifetime tokens, this means that, given a token for $\kappa$, we should be able to obtain a token for $\kappa'$.
Conversely, given a dead token for $\kappa'$, we should be able to obtain a dead token for $\kappa$, as well. This
is reflected in the definition of lifetime inclusion:

$$\kappa \sqsubseteq \kappa' := \Box \Big( \big(\forall q.\ [\kappa]_q \mathrel{\equiv\!\!\ast} \exists q'.\ [\kappa']_{q'} \ast ([\kappa']_{q'} \mathrel{\equiv\!\!\ast} [\kappa]_q)\big) \ast \big([\dagger\kappa'] \mathrel{\equiv\!\!\ast} [\dagger\kappa]\big) \Big)$$

The first part says that we can trade a fraction of the token of $\kappa$ for a potentially different fraction
of the token of $\kappa'$. It also provides a way to revert this trading to recover the original token of $\kappa$,
so that no token is permanently lost. The second part of this definition is the analogue for dead
tokens. Note that since dead tokens are persistent, it is not necessary to provide a way to recover
the dead token that is passed in. The entire definition is wrapped in Iris’s persistence modality $\Box$ to
make lifetime inclusion a persistent assertion that can be reused as often as needed.

It is easy to show that lifetime inclusion is a preorder. Inclusion can be used to shorten a full
borrow (LftL-bor-shorten): if a full borrow is valid for a long lifetime, then it should also be valid
for the shorter one. This rule justifies subtyping based on lifetimes in λRust.

An even stronger use of lifetime inclusion is reborrowing, expressed by LftL-reborrow. This
rule is used to prove the reborrowing rule in the type system, C-reborrow. Unlike shortening,
reborrowing provides an inheritance to regain the initial full borrow after the shorter lifetime has
ended. This may sound intuitively plausible, but turns out to be extremely subtle. In fact, most of
the complexity in the model of the lifetime logic arises from reborrowing.


5.2.1 Lifetime intersection. Beyond having a preorder, it turns out that lifetimes also have a
greatest lower bound: given two lifetimes $\kappa$ and $\kappa'$, their intersection $\kappa \sqcap \kappa'$ is the lifetime that ends
whenever either of the operands ends.

Fig. 5. Selected rules for fractured borrows.


Lifetime intersection is particularly useful to create a fresh lifetime that is a sublifetime of some
existing $\kappa$. We invoke the rule LftL-begin to create an auxiliary lifetime $\alpha_0$, and then we use the
intersection $\alpha := \alpha_0 \sqcap \kappa$ as our new lifetime. It follows that $\alpha \sqsubseteq \kappa$. In the type system, we use this
in the proof of F-newlft to create a new lifetime $\alpha$ that is shorter than all the lifetimes in $\bar{\kappa}$.

Intersection of lifetimes interacts well with lifetime tokens: a token of the intersection is
composed of tokens of both operands, at the same fraction (LftL-tok-inter). In other words, in order
to prove that an intersection is alive, we have to prove that both operands are alive. Similarly, in
order to prove that an intersection has ended, it suffices to prove that either operand has ended
(LftL-end-inter). These laws let us do the token trading required by lifetime inclusion, showing
that intersection indeed is the greatest lower bound for $\sqsubseteq$ (LftL-incl-isect, LftL-incl-glb).

Furthermore, intersection has a unit $\varepsilon$. This lifetime never ends (LftL-end-unit) and we can freely
get tokens for it (LftL-tok-unit). We use $\varepsilon$ to model the static lifetime.
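
Surface Rust has no explicit intersection operator on lifetimes, but a lifetime that is a lower bound of two others can be expressed with outlives bounds. The following sketch is only an analogy (the function `pick` is hypothetical): the result is valid for a lifetime 'c that ends no later than either 'a or 'b, and references with the static lifetime fit any such 'c.

```rust
/// Returns one of two references. The result lifetime 'c is a common
/// sublifetime of 'a and 'b, as stated by the bounds 'a: 'c and 'b: 'c.
pub fn pick<'a: 'c, 'b: 'c, 'c>(first: bool, x: &'a i32, y: &'b i32) -> &'c i32 {
    if first { x } else { y }
}

/// A reference with the static lifetime can be combined with any other one.
pub fn pick_with_static<'b>(y: &'b i32) -> &'b i32 {
    static ZERO: i32 = 0;
    pick(false, &ZERO, y)
}
```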


5.3 Fractured Borrows


Full borrows and lifetimes are powerful tools for modeling temporary ownership in Iris. However,
they cannot be used as-is for modeling Rust’s shared references. In §4.3, we used the notion of
fractured borrows as our key notion for defining the default read-only sharing predicate. Figure 5
gives the main reasoning rules for fractured borrows.

To make it possible to use them as a sharing predicate, fractured borrows are persistent and, just
like full borrows (LftL-bor-shorten), they can be shortened. Because they are persistent, fractured
borrows can potentially be accessed simultaneously by several parties. As such, they cannot provide
access to the full underlying resource. Instead, LftL-fract-acc provides access only to some fraction
of the borrowed content.

To express this, fractured borrows work on a predicate $\Phi$ over fractions that has to be compatible
with addition: $\Phi(q_1 + q_2) \Leftrightarrow \Phi(q_1) \ast \Phi(q_2)$. When using LftL-fract-acc to access the content
of the fractured borrow, we get $\Phi(q)$ for some unknown fraction $q$. This works because no matter
how many threads access the same fractured borrow at the same time, it is always possible to give
out some tiny fraction of $\Phi$ and keep some remainder available for the next thread. Similarly to full
borrows, LftL-fract-acc requires a lifetime token for witnessing that the lifetime is alive, and gives
back the lifetime token only when the resource is returned.

Fractured borrows can be created from a full borrow of $\Phi(1)$ using the LftL-bor-fracture rule.
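
As a concrete instance, consider the predicate used in the sharing predicate of simple types in §4.3, namely the fractional points-to assertion. Assuming the usual splitting law for fractional permissions, compatibility with addition can be checked as follows:

```latex
\Phi(q) := \ell \mapsto^{q} v
\qquad\text{so that}\qquad
\Phi(q_1 + q_2)
  \;=\; \ell \mapsto^{q_1 + q_2} v
  \;\Leftrightarrow\; \ell \mapsto^{q_1} v \,\ast\, \ell \mapsto^{q_2} v
  \;=\; \Phi(q_1) \ast \Phi(q_2)
```

Here $\Phi(1)$ is the full points-to assertion for the memory cell, which is what a full borrow has to contain before it can be fractured.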


5.3.1 Lifetime inclusion and fractured borrows. Fractured borrows have an interesting interaction
with lifetime inclusion. Assume we have a fractured borrow of a lifetime token for another lifetime $\kappa'$.
That is, assume $\Phi(q') = [\kappa']_{q'}$. The rule LftL-fract-acc for accessing fractured borrows turns out
to be exactly the first part of the token trading scheme that we used for defining the lifetime
inclusion $\kappa \sqsubseteq \kappa'$. In fact, by using some further properties of fractured borrows (see our technical
appendix [Jung et al. 2017a]), we can also prove the trading scheme for dead tokens, so that we
have:

$$\&^{\kappa}_{\mathsf{frac}} \big(\lambda q'.\ [\kappa']_{q'}\big) \Rightarrow \kappa \sqsubseteq \kappa'$$

This can be generalized to fracturing just a part of the token, i.e., $\Phi(q') = [\kappa']_{q' \cdot q}$. The reason this
makes sense is that with the token of a longer lifetime $\kappa'$ being borrowed at some shorter lifetime $\kappa$,
it is impossible to end $\kappa'$ while $\kappa$ is still ongoing: ending $\kappa'$ needs the full token, but part of that
token is stuck in $\kappa$ and can only be recovered through an inheritance.

Deriving lifetime inclusions from fractured borrows significantly expands the power of lifetime
inclusion. So far, we have seen that we can use lifetime intersection to make a fresh $\alpha$ a sublifetime
of some existing $\kappa$; however, for this to work out, we have to decide in advance which other lifetimes
$\alpha$ is going to be a sublifetime of. Using fractured borrows, we can establish additional lifetime
inclusions dynamically, when the involved lifetimes are already ongoing and in active use. It turns
out that interior mutable types like RefCell<T> or RwLock<T> allow sharing data structures for a
lifetime that cannot be established in advance, and we thus found this new scheme for proving
lifetime inclusion crucial in proving the safety of such types.
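
The dynamic nature of such sharing is visible in ordinary client code for RefCell. The sketch below (with the hypothetical function name `refcell_demo`) shows that how long the contents are borrowed is decided at run time by how long a guard is kept, and that a conflicting borrow is refused while a guard is alive.

```rust
use std::cell::RefCell;

pub fn refcell_demo() -> (i32, bool) {
    let cell = RefCell::new(1);
    let shared: &RefCell<i32> = &cell;
    {
        // Mutable access through a shared reference; the borrow lasts
        // exactly as long as the guard, which is only known at run time.
        let mut guard = shared.borrow_mut();
        *guard += 1;
    }
    // While a reader guard is alive, a writer is refused dynamically.
    let reader = shared.borrow();
    let write_refused = shared.try_borrow_mut().is_err();
    (*reader, write_refused)
}
```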


6 MODELING TYPES WITH INTERIOR MUTABILITY 


As we have discussed in §2.5, the standard library of Rust provides types with interior mutability.
These types, written in Rust using unsafe features, can nonetheless be used safely because the
interface they provide to client code encapsulates these unsafeties behind well-typed abstractions.
We have proven the safety of several such libraries, namely: Cell, RefCell, Mutex, RwLock, Rc,
and Arc.¹⁰ To fulfill this goal, we had to first pick semantic interpretations for the abstract types
exported by these libraries (e.g., Cell<T>). We then proved that each publicly exported function
from these libraries satisfies the semantic interpretation of its type.

Usually, when modeling types with interior mutability, the most difficult definition is that of the
sharing predicate $\llbracket\tau\rrbracket.\textsc{shr}$. Indeed, these types use a sharing predicate which is different from the
default, read-only one that we described in §4.3. The sharing predicates vary greatly depending
on which operations are allowed. Most of them use a new variant of borrow propositions, called 
persistent borrows, which we present in this section. As it turns out, all the variants of borrow 
propositions (including full and fractured borrows) are encodable in terms of a single internal 
mechanism, called indexed borrows, but the explanation of this encoding would take us too far afield 
(details are explained in the technical appendix [Jung et al. 2017a]). We focus our explanations on 
two representative forms of interior mutability that we have already presented in §2: Cell and 
Mutex. 


6.1. Cell 


In §2.5.1, we have seen that Cell<T> stores values of type T and provides two functions: get and
set, which can be used for reading from and writing to the cell. It turns out that the ownership predicate and
size of $\mathbf{cell}(\tau)$, the equivalent of Cell<T> in $\lambda_{\mathrm{Rust}}$, are the same as those of $\tau$. In fact, Rust’s standard library
provides two functions for converting between T and Cell<T>, Cell::new and Cell::into_inner,
both of which are effectively the identity function.
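
As a small illustration of this point, the following sketch uses only the standard library to convert a value into a Cell and back, and to compare the sizes of the two types. The helper names round_trip and same_size are ours and do not appear in the paper.

```rust
use std::cell::Cell;
use std::mem::size_of;

/// Wrap a value in a Cell and unwrap it again; the value is unchanged.
fn round_trip(x: i32) -> i32 {
    let c: Cell<i32> = Cell::new(x);
    c.into_inner()
}

/// Cell<i32> occupies exactly as much memory as i32.
fn same_size() -> bool {
    size_of::<Cell<i32>>() == size_of::<i32>()
}
```

Calling round_trip(7) yields 7, and same_size() yields true, matching the claim that an owned Cell<T> is represented just like a T.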

The sharing predicate is where things get interesting. Remember that get and set can be called 
even if you only have a shared reference to a Cell<T>. This means that Cell<i32> must use a very
different sharing predicate than i32, which just provides read-only access. In contrast, to verify set, 
we need temporary full access for the duration of the function call. However, it is also important 
that all shared references to a Cell are confined to a single thread, since the get and set operations 
are not thread-safe. Recall that Rust enforces this by declaring that Cell is not Sync, which is 


¹⁰ Note that some simplifications of our setup make the proof of some of these libraries simpler. More precisely, we are
not handling unwinding after panics, and all atomic memory operations are sequentially consistent, while Rust’s standard 
library uses weaker atomic accesses. 


Proceedings of the ACM on Programming Languages, Vol. 2, No. POPL, Article 66. Publication date: January 2018. 




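LFTL-BOR-NA: $\&^{\kappa}_{\mathrm{full}}\, P \;\Rrightarrow_{\mathcal{N}_{\mathrm{lft}}}\; \&^{\kappa/t}_{\mathrm{na}}\, P$

LFTL-NA-ACC: $\&^{\kappa/t}_{\mathrm{na}}\, P \;\vdash\; [\kappa]_q * [\mathrm{Na} : t] \;\Rrightarrow_{\mathcal{N}_{\mathrm{lft}}}\; \triangleright P * \big(\triangleright P \;\Rrightarrow_{\mathcal{N}_{\mathrm{lft}}}\; [\kappa]_q * [\mathrm{Na} : t]\big)$

Fig. 6. Selected rules for non-atomic persistent borrows.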


equivalent to saying that &Cell is not Send, so that shared references to it cannot be sent to another
thread—they must stay in the thread they have initially been created in. 
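
To make this tension concrete, here is a minimal sketch of a Cell-like type for Copy data, built on UnsafeCell. The names MyCell and bump_through_aliases are hypothetical and introduced only for illustration; the real Cell in the standard library is more general, and this is not the code verified in the paper.

```rust
use std::cell::UnsafeCell;

/// Hypothetical minimal cell for Copy types (illustration only).
/// Because UnsafeCell<T> is not Sync, MyCell<T> is not Sync either,
/// so a &MyCell<T> cannot be sent to another thread.
pub struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    pub fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    /// Read through a shared reference by copying the content out.
    pub fn get(&self) -> T {
        // SAFETY: all aliases live on one thread (MyCell is !Sync), and no
        // reference into the interior is ever handed out, so nothing can
        // observe the content while it is being read.
        unsafe { *self.value.get() }
    }

    /// Write through a shared reference: temporary full access to the content.
    pub fn set(&self, value: T) {
        // SAFETY: same argument as in `get`; the write is finished before
        // any other alias on this thread can run.
        unsafe { *self.value.get() = value }
    }
}

/// Mutate the same cell through two aliasing shared references.
pub fn bump_through_aliases() -> i32 {
    let c = MyCell::new(1);
    let (a, b) = (&c, &c);
    a.set(a.get() + 1); // content is now 2
    b.set(b.get() * 10); // content is now 20
    c.get()
}
```

Here bump_through_aliases() returns 20. The safety comments rely on exactly the two facts discussed in the text: each access needs full access only for the duration of the call, and all aliases are confined to one thread.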

In order to encode this idea, we use non-atomic persistent borrows, another kind of borrow derived 
from the lifetime logic. Some of their rules are presented in Figure 6. Like fractured borrows, non- 
atomic persistent borrows are persistent, can be created from full borrows, and support shortening. 
However, the rule to access the borrows is different: LftL-na-acc gives full access to the borrowed
content, so it is important that concurrent threads not be allowed to access the same borrow 
simultaneously. To this end, the borrows depend on a thread identifier t. Accessing them requires 
a non-atomic token [Na : t] bound to that thread identifier. This token is created at the birth of 
the thread, and threaded through all of its control flow. That is, every function receives it and has 
to return it. The token is required to open a borrow, and not returned until the borrow is closed, 
making it impossible to open it twice at the same time. 
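
The token discipline can be mimicked in Rust itself with a value that is moved into an opened borrow and only handed back on closing. The following is a toy analogy under our own assumptions, not the Iris encoding: NaToken, NaBorrow, Opened, and demo are hypothetical names, and the sketch simply assumes that only one token is ever created per thread identifier.

```rust
use std::cell::{RefCell, RefMut};

/// Hypothetical stand-in for the token [Na : t]; neither Clone nor Copy.
pub struct NaToken {
    thread_id: u32,
}

/// Hypothetical stand-in for a non-atomic persistent borrow tied to a thread.
pub struct NaBorrow<T> {
    thread_id: u32,
    content: RefCell<T>,
}

/// An opened borrow. It keeps the token until it is closed.
pub struct Opened<'a, T> {
    content: RefMut<'a, T>,
    token: NaToken,
}

impl<T> NaBorrow<T> {
    pub fn new(thread_id: u32, value: T) -> Self {
        NaBorrow { thread_id, content: RefCell::new(value) }
    }

    /// Opening consumes the token, so nothing else can be opened meanwhile.
    pub fn open(&self, token: NaToken) -> Opened<'_, T> {
        assert_eq!(self.thread_id, token.thread_id, "token of another thread");
        Opened { content: self.content.borrow_mut(), token }
    }
}

impl<T> Opened<'_, T> {
    /// Full access to the borrowed content while the borrow is open.
    pub fn content(&mut self) -> &mut T {
        &mut *self.content
    }

    /// Closing releases the content and gives the token back.
    pub fn close(self) -> NaToken {
        self.token
    }
}

pub fn demo() -> i32 {
    let token = NaToken { thread_id: 0 };
    let b = NaBorrow::new(0, 1);
    let mut opened = b.open(token);
    *opened.content() += 1;
    // A second `b.open(token)` here would not compile: `token` has been moved.
    let token = opened.close();
    let mut again = b.open(token);
    *again.content() *= 21;
    let _token = again.close();
    b.content.into_inner()
}
```

In this sketch demo() returns 42. The dynamic check inside RefCell can never fail as long as there is a single token, because the type system already prevents opening a borrow twice at the same time.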

We can now use non-atomic persistent borrows to give the sharing predicate of cell(r): 


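$$\llbracket\mathbf{cell}(\tau)\rrbracket.\textsc{shr}(\kappa, t, \ell) \;:=\; \&^{\kappa/t}_{\mathrm{na}} \big(\exists \bar{v}.\; \ell \mapsto \bar{v} * \llbracket\tau\rrbracket.\textsc{own}(t, \bar{v})\big)$$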


Note, in particular, that our model of Cell<T> reflects the fact that it is never Sync, since its sharing
predicate depends on the thread identifier $t$. However, if $\llbracket\tau\rrbracket.\textsc{own}(t, \bar{v})$ does not depend on $t$, then
neither does $\llbracket\mathbf{cell}(\tau)\rrbracket.\textsc{own}(t, \bar{v})$: just like in Rust, Cell<T> is Send if and only if T is.
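
On the Rust side, these two facts can be observed with trait bounds. The sketch below uses only the standard library; the helper names assert_send, cell_is_send_when_content_is, and move_cell_to_thread are ours. The rejected cases are shown as comments because they do not compile.

```rust
use std::cell::Cell;
use std::thread;

/// Compiles only if T is Send.
fn assert_send<T: Send>() {}

fn cell_is_send_when_content_is() {
    assert_send::<Cell<i32>>();
    assert_send::<Cell<Vec<u8>>>();
    // Rejected by the compiler, since Rc<i32> is not Send:
    //     assert_send::<Cell<std::rc::Rc<i32>>>();
    // Rejected for every T, since Cell<T> is never Sync:
    //     fn assert_sync<T: Sync>() {}
    //     assert_sync::<Cell<i32>>();
}

/// An owned Cell may move to another thread; only sharing it is forbidden.
fn move_cell_to_thread() -> i32 {
    let c = Cell::new(5);
    thread::spawn(move || {
        c.set(c.get() + 1);
        c.get()
    })
    .join()
    .unwrap()
}
```

Here move_cell_to_thread() returns 6: ownership of the cell is transferred as a whole, so no two threads ever hold a reference to it.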


6.2 Mutex 


Mutex is the other example of interior mutability that we presented in §2.5. Mutex<T> uses a lock 
to safely grant multiple threads read and write access to a shared object of type T. 
We start by giving its size and ownership predicate: 


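$$\llbracket\mathbf{mutex}(\tau)\rrbracket.\textsc{size} := 1 + \llbracket\tau\rrbracket.\textsc{size} \qquad\qquad \llbracket\mathbf{mutex}(\tau)\rrbracket.\textsc{own}(t, \bar{v}) := \llbracket\mathbf{bool} \times \tau\rrbracket.\textsc{own}(t, \bar{v})$$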


That is, when it is not shared, $\mathbf{mutex}(\tau)$ is exactly the same as a pair of a bool (representing the
status of the lock¹¹) and an object of type $\tau$ (the content).

The sharing predicate is more complex: It cannot use fractured borrows, because we cannot afford 
getting only a fraction of ownership, and it cannot use non-atomic persistent borrows, because 
mutexes are thread-safe. Instead, it uses yet another kind of borrow, atomic persistent borrows, whose 
rules can be found in the appendix [Jung et al. 2017a]. Again, they are distinguished from the other 
borrows in the rule granting access to the borrowed content. Here, the mechanism used to prevent 
two threads from accessing the same borrow at the same time is atomicity: The proof rules enforce
that an atomic persistent borrow cannot be opened for longer than a single, atomic instruction. 
Thus, during the execution of any given instruction, only one thread can be accessing the borrow. 
Returning to Mutex’s sharing predicate, the content of its borrow will only get accessed when 
changing the status of the lock, and doing so will require atomic memory accesses. Of course, this 
corresponds to the fact that, in our spinlock implementation, we are only using atomic sequentially 
consistent instructions to read or write the status flag. Using non-atomic accesses would lead to 
data races. 


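¹¹ The actual implementation in Rust uses the locking primitives of the operating system. We use our own spinlock-based
implementation to model that.

Using atomic persistent borrows, we can give the sharing predicate for mutexes:

$$\llbracket\mathbf{mutex}(\tau)\rrbracket.\textsc{shr}(\kappa, t, \ell) \;:=\; \exists \kappa'.\; \kappa \sqsubseteq \kappa' * \&^{\kappa'}_{\mathrm{at}} \Big(\ell \mapsto \mathsf{true} \;\lor\; \ell \mapsto \mathsf{false} * \&^{\kappa'}_{\mathrm{full}} \big(\exists \bar{v}.\; (\ell + 1) \mapsto \bar{v} * \llbracket\tau\rrbracket.\textsc{own}(t, \bar{v})\big)\Big)$$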


This is quite a mouthful: First, we use an existential quantification at the beginning to close the 
predicate under shorter lifetimes, satisfying ty-shr-mono. We use an atomic persistent borrow to
share ownership of the status flag at location $\ell$. This defines an invariant that is maintained until $\kappa'$
ends. The invariant can be in one of two states: In the first state, the flag is true, in which case the 
lock is locked and no other resource is stored in the borrow. Ownership of the content is currently 
held by whichever thread acquired the lock. In the second state, the flag is false. This means the 
lock is unlocked, and the borrow also stores the ownership of the content at type $\tau$ at location $\ell + 1$.
When acquiring or releasing the lock, we can atomically open the persistent borrow and change 
the branch of the disjunction, thus acquiring or releasing ownership of the content. 
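
The two states of this invariant can be mirrored in a small spinlock written in Rust, in the spirit of the spinlock-based model described above. This is a sketch under our own assumptions rather than the verified code: SpinMutex, SpinGuard, and count_with_spinlock are hypothetical names, and every access to the flag is sequentially consistent, as in the paper's setup.

```rust
use std::cell::UnsafeCell;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Hypothetical spinlock: a status flag followed by the content.
pub struct SpinMutex<T> {
    locked: AtomicBool,
    content: UnsafeCell<T>,
}

// SAFETY: the content is only reachable through a guard, and at most one
// guard exists at a time because the flag is flipped atomically.
unsafe impl<T: Send> Sync for SpinMutex<T> {}

/// Holding a guard corresponds to the "flag is true" state of the invariant.
pub struct SpinGuard<'a, T> {
    mutex: &'a SpinMutex<T>,
    // Raw-pointer marker: keeps the guard on the thread that acquired it.
    _not_send_sync: PhantomData<*mut ()>,
}

impl<T> SpinMutex<T> {
    pub fn new(value: T) -> Self {
        SpinMutex { locked: AtomicBool::new(false), content: UnsafeCell::new(value) }
    }

    /// Atomically move from "false, content stored" to "true, content taken".
    pub fn lock(&self) -> SpinGuard<'_, T> {
        while self
            .locked
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .is_err()
        {
            std::hint::spin_loop();
        }
        SpinGuard { mutex: self, _not_send_sync: PhantomData }
    }

    pub fn into_inner(self) -> T {
        self.content.into_inner()
    }
}

impl<T> Deref for SpinGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: this guard holds the lock, so no one else touches the content.
        unsafe { &*self.mutex.content.get() }
    }
}

impl<T> DerefMut for SpinGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: as above; the guard is the unique accessor while it exists.
        unsafe { &mut *self.mutex.content.get() }
    }
}

impl<T> Drop for SpinGuard<'_, T> {
    /// Atomically move back to the "false, content stored" state.
    fn drop(&mut self) {
        self.mutex.locked.store(false, Ordering::SeqCst);
    }
}

pub fn count_with_spinlock(threads: usize, per_thread: usize) -> usize {
    let m = SpinMutex::new(0usize);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    *m.lock() += 1;
                }
            });
        }
    });
    m.into_inner()
}
```

With this sketch, count_with_spinlock(4, 1000) returns 4000. The compare-and-exchange in lock and the store in drop are the only points where the shared flag is touched, which corresponds to opening the atomic persistent borrow for a single instruction.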

Curiously, ownership of the content is wrapped in a full borrow. One might expect instead that 
it should be directly contained in the outer persistent borrow. In this case, acquiring the lock would 
result in acquiring full (unborrowed) ownership of the content of the mutex. That, however, does 
not work: Imagine $\kappa'$ ends while the lock is held. (That is possible, for example, if the MutexGuard is
leaked and hence its destructor never gets called.) In this case, ownership of the content would never
be returned to the borrow. However, when $\kappa'$ ends, the Mutex is again fully owned by someone,
which means they expect to be the exclusive owner of the content! This is why the full borrow is
necessary: When taking the lock, one gets the inner resource only under a borrow at lifetime $\kappa'$,
guaranteeing that ownership is returned when $\kappa'$ ends.
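
The leaking scenario can be reproduced with the standard library's Mutex. The sketch below is only an illustration (the function name leak_then_reclaim is ours): the guard is leaked with std::mem::forget, so the lock is never released, yet the owner of the Mutex regains exclusive access to the content once the shared borrow has ended.

```rust
use std::sync::Mutex;

fn leak_then_reclaim() -> i32 {
    let mut m = Mutex::new(1);
    {
        let guard = m.lock().unwrap();
        // The destructor never runs, so the lock stays held forever.
        std::mem::forget(guard);
    }
    // The lock is still held, so another attempt to take it fails.
    assert!(m.try_lock().is_err());
    // All shared borrows of `m` have ended, so the owner has exclusive
    // access again and does not need to go through the lock.
    *m.get_mut().unwrap() += 41;
    m.into_inner().unwrap()
}
```

The call leak_then_reclaim() returns 42. For this to be sound, the leaked guard must not retain ownership of the content beyond the lifetime of the shared borrow, which is the role played by the inner full borrow in the sharing predicate.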

To conclude, observe that if the inner type does not depend on the thread identifier $t$ (which
corresponds to saying that it is Send), then neither $\llbracket\mathbf{mutex}(\tau)\rrbracket.\textsc{own}$ nor $\llbracket\mathbf{mutex}(\tau)\rrbracket.\textsc{shr}$ does, so
that $\mathbf{mutex}(\tau)$ is both Send and Sync. This exactly corresponds to Rust’s behavior.
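
The corresponding behavior in Rust can be checked directly. In the sketch below (the helper names assert_send, assert_sync, and shared_counter are ours), the content type Cell<i32> is Send but not Sync, and wrapping it in a Mutex yields a type that is both, so it can be shared between scoped threads.

```rust
use std::cell::Cell;
use std::sync::Mutex;
use std::thread;

/// Compile only if T is Send, respectively Sync.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn shared_counter() -> i32 {
    // Cell<i32> is Send, hence Mutex<Cell<i32>> is both Send and Sync.
    assert_send::<Mutex<Cell<i32>>>();
    assert_sync::<Mutex<Cell<i32>>>();

    let m = Mutex::new(Cell::new(0));
    thread::scope(|s| {
        for _ in 0..4 {
            // Each thread only captures a shared reference to the mutex.
            s.spawn(|| {
                let guard = m.lock().unwrap();
                guard.set(guard.get() + 1);
            });
        }
    });
    m.into_inner().unwrap().get()
}
```

Here shared_counter() returns 4. A content type that is not Send, such as Rc<i32>, would make both assertions fail to compile.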


7 PROOF OF SOUNDNESS 


Having defined the semantics of types in §4, we can finish up our formal development by defining 
semantic interpretations of the judgments presented in §3.3. We focus on the two most important 
ones: typing of instructions and typing of function bodies. Their interpretations use Hoare triples: 


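$$\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{T}_1 \models I \dashv x.\, \mathbf{T}_2 \;:=\; \forall \gamma, t.\; \big\{\llbracket\mathbf{E}\rrbracket_\gamma * \llbracket\mathbf{L}\rrbracket_\gamma * [\mathrm{Na} : t] * \llbracket\mathbf{T}_1\rrbracket_\gamma(t)\big\}\; I\; \big\{v.\; \llbracket\mathbf{L}\rrbracket_\gamma * [\mathrm{Na} : t] * \llbracket\mathbf{T}_2\rrbracket_{\gamma[x \leftarrow v]}(t)\big\}$$

$$\Gamma \mid \mathbf{E}; \mathbf{L} \mid \mathbf{K}; \mathbf{T} \models F \;:=\; \forall \gamma, t.\; \big\{\llbracket\mathbf{E}\rrbracket_\gamma * \llbracket\mathbf{L}\rrbracket_\gamma * [\mathrm{Na} : t] * \llbracket\mathbf{T}\rrbracket_\gamma(t) * \llbracket\mathbf{K}\rrbracket_\gamma(t)\big\}\; F\; \big\{\mathsf{True}\big\}$$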


In the preconditions, we can find the interpretations of the various contexts, together with the thread- 
local token [Na : t] used to access the non-atomic persistent borrows of thread t. The instruction 
judgment nicely demonstrates how this token is threaded through, alongside the local lifetime 
context $\llbracket\mathbf{L}\rrbracket_\gamma$, which contains all the lifetime tokens. The interpretation of the external lifetime
context $\llbracket\mathbf{E}\rrbracket_\gamma$ just involves the lifetime inclusion from the lifetime logic; this makes it persistent, so
it does not have to be threaded through. Finally, $\llbracket\mathbf{T}\rrbracket_\gamma(t)$ uses the semantic interpretation of types
as defined in the previous sections, tying them to the current thread t. 

The function judgment, on the other hand, has a trivial post-condition—remember that functions 
do not return, they call a continuation. The Hoare triple for that continuation is provided by 
$\llbracket\mathbf{K}\rrbracket_\gamma(t)$, and it will require [Na : t] as well as $\llbracket\mathbf{L}\rrbracket_\gamma$ in its precondition, which is how the tokens
travel between functions. Using True as the trivial post-condition may be surprising; the more
common choice for a CPS language is certainly False. However, for the adequacy theorem below, we
want to talk about executing a full program, so we have to give a “halting continuation”. We could
of course make that continuation diverge, but instead we decided to make it return immediately.
This has the benefit that the entire program actually terminates; however, it also means that we 
have to pick True as the post-condition of our continuations. 


Soundness. With the semantic judgments defined, we can state two core theorems showing 
the soundness of our type system. The first one shows a deep connection between the semantic 
judgments and their syntactic counterparts. 


THEOREM 7.1 (FUNDAMENTAL THEOREM OF LOGICAL RELATIONS). For any inference rule of the 
type system, when we replace all $\vdash$ by $\models$, the resulting Iris theorem holds.


One important corollary of the fundamental theorem is that if a judgment can be derived 
syntactically, then it also holds semantically. However, Theorem 7.1 is much stronger than this, 
because we can use it to glue together safe and unsafe code. Given a program that is syntactically 
well-typed except for certain components that are only semantically (but not syntactically) well- 
typed, the fundamental theorem tells us that the entire program is semantically well-typed. 

The second theorem is an adequacy theorem, relating the logical relation to program behavior: 


THEOREM 7.2 (ADEQUACY). Let $f$ be a $\lambda_{\mathrm{Rust}}$ function such that $\emptyset \mid \emptyset; \emptyset \mid \emptyset \models f \dashv x.\, x \lhd \mathbf{fn}() \to \Pi[]$
holds. Then when we execute f with the default continuation (which is just a no-op), no execution ends 
in a stuck state. 


In particular, the adequacy theorem guarantees that a semantically well-typed program is memory 
and thread safe: It will never perform any invalid memory access and will not have data races. 

Put together, these theorems establish that, if the only code in a $\lambda_{\mathrm{Rust}}$ program that is not syntactically
well-typed appears in semantically well-typed libraries, then the program is safe to execute.


8 RELATED WORK 


Substructural type systems for state, and their soundness proofs. Over the past decades, 
numerous languages and type systems have been developed that use linear types [Wadler 1990], 
ownership [Clarke et al. 1998], and/or regions [Fluet et al. 2006] to guarantee safety of heap- 
manipulating programs. These include Cyclone [Jim et al. 2002], Vault [DeLine and Fahndrich 
2001], and Alms [Tov and Pucella 2011]. Much of this work has influenced the design of Rust, but 
a detailed discussion of that influence is beyond the scope of this paper. The key point for our 
purposes is that most such systems are closed-world, meaning that they are defined by a fixed set of 
rules and are proven sound using syntactic techniques [Wright and Felleisen 1994]. As explained in 
§1.1, Rust’s extensible type system fundamentally does not fit into this paradigm. 

In a related but very different line of work, systems like Ynot [Nanevski et al. 2008], FCSL [Nanevski
et al. 2014], and F* [Swamy et al. 2016] integrate variants of separation logic into dependent type 
theory. These systems are aimed at full functional verification of low-level imperative code and 
thus require a significant amount of manual proof and/or type annotations compared to Rust. 

Mezzo [Balabonski et al. 2016] can be placed somewhere between these two approaches. It 
comes with a substructural type system whose expressivity parallels that of a separation logic. Its 
soundness proof is modular in the sense that the authors start by verifying a core type system, 
and then add various extensions. This relies on an abstract notion of resources called monotonic 
separation algebras. Nevertheless, Mezzo’s handling of types remains entirely syntactic (e.g., based 
on the grammar of types); there is no semantic account for types that would permit “adding” new 
types without revisiting the proofs. 

We are only aware of a few substructural type systems for which soundness has been proven 
semantically (using logical relations). These include $L^3$ [Ahmed et al. 2007], $\lambda^{\mathrm{URAL}}$ [Ahmed et al.
2005], and the “superficially substructural” type system of Krishnaswami et al. [2012]. Ahmed et
al.’s motivations for doing semantic soundness proofs were somewhat different from ours. One
of their motivations was to build a foundation for substructural extensions to the Foundational 
Proof-Carrying Code project [Appel 2001]. Another was to make it possible to modularly extend 
soundness proofs when building up the features of a language incrementally (although it is worth 
noting that Balabonski et al. achieved similarly modular proofs for Mezzo using only syntactic 
methods). In contrast, following Krishnaswami et al. [2012], we are focused on building a soundness 
proof that is “extensible” along a different axis, namely the ability to verify soundness of libraries 
that extend Rust’s core type system through their use of unsafe features. Lastly, all of the prior 
semantic soundness proofs were done directly using set-theoretic step-indexed models, whereas in 
the present work, in order to model the complexities of Rust’s lifetimes and borrowing, we found it 
essential to work at the higher level of abstraction afforded by Iris and our lifetime logic. 

Cogent [Amani et al. 2016; O’Connor et al. 2016] is a purely functional, linearly typed language 
designed to implement file systems and verify their functional correctness. Its linear type system 
permits efficient compilation to machine code using in-place updates, while the purely functional 
semantics enables equational reasoning. Its design is such that missing functionality can be implemented
in C functions (much like unsafe code in Rust), which are given types to enforce correct
usage in the Cogent program. These C functions are then manually verified to implement an 
equational specification and to follow the guarantees of the type system. However, the language 
and the type system are much simpler than Rust’s (e.g., there is no support for recursion, iteration, 
borrowing, or mutable state). 

Rust’s concept of lifetimes has appeared before in the form of regions [Fahndrich and DeLine 
2002; Grossman et al. 2002]. The work by Fluet et al. [2006] on linear regions bears some similarity 
to the lifetime logic, with region capabilities corresponding to lifetime tokens and references 
corresponding to borrows. However, their approach does not rule out combining mutation with 
aliasing. This is not a problem because they consider neither deep pointers (where writing to one 
pointer can invalidate an aliasing pointer) nor concurrency. We believe that, to extend linear regions 
to handle Rust’s unique borrows, one would end up needing something akin to our lifetime logic. 

Formal results for Rust. Patina [Reed 2015] is a formalization of the Rust type system, with 
accompanying partial proofs of progress and preservation. Being syntactic, these proofs do not 
scale to account for unsafe code. To keep our formalization feasible, we did not reuse the syntax 
and type system of Patina, but rather designed $\lambda_{\mathrm{Rust}}$ from scratch in a way that better fits Iris.

CRUST [Toman et al. 2015] is a bounded model checker designed to verify the safety of Rust 
libraries implemented using unsafe code. It checks that all clients calling up to n library methods 
do not trigger memory safety faults. This provides an easy-to-use, automated way of checking 
unsafe code, before attempting a full formal proof. Their approach has successfully re-discovered 
some soundness bugs that had already been fixed in Rust’s standard library. However, by only 
considering one library at a time, it cannot find bugs that arise from the interaction of multiple 
libraries [Ben-Yehuda 2015c]. 

Concurrent separation logics. RustBelt builds on the Iris framework [Jung et al. 2017b], which 
in turn incorporates several great advances made in the past decade in the area of concurrent 
separation logics [Appel 2014; Dinsdale-Young et al. 2013, 2010; Dodds et al. 2009; Nanevski et al. 
2014; O’Hearn 2007; Svendsen and Birkedal 2014]. In particular, RustBelt depends crucially on 
Iris’s support for: (1) custom notions of logical resource (i.e., “fictional separation” [Jensen and 
Birkedal 2012]), which we use to model novel abstract predicates like the various forms of borrow 
propositions; (2) impredicative invariants [Svendsen and Birkedal 2014], which we use to model 
higher-order state; and (3) support for tactical proofs in Coq [Krebbers et al. 2017b], without which 
a verification of the scale and complexity of RustBelt would not be possible.

One recent innovation in separation logics is temporary read-only permissions [Charguéraud and
Pottier 2017]. The authors introduce a duplicable “read-only” modality with rules that resemble 
ours for shared references at “simple” types like i32. However, since shared references permit 
interior mutability, the read-only permission is not suited to directly modeling shared references. 
Nevertheless, it would be interesting to explore whether this approach can facilitate the tracking of 
lifetime tokens, just like read-only permissions eliminate the bookkeeping involved in fractional 
permissions. One challenge here is that $\lambda_{\mathrm{Rust}}$ supports non-lexical lifetimes [Matsakis 2016b],
whereas read-only permissions are strictly lexical. 


9 CONCLUSION 


We have described $\lambda_{\mathrm{Rust}}$, a formal version of the Rust type system that we used to study Rust’s
ownership discipline in the presence of unsafe code. We have shown that various important 
Rust libraries with unsafe implementations, many of them involving interior mutability, are safely 
encapsulated by their type. We had to make some concessions in our modeling: We do not model 
(1) more relaxed forms of atomic accesses, which Rust uses for efficiency in libraries like Arc; (2) 
Rust’s trait objects (comparable to interfaces in Java), which can pose safety issues due to their 
interactions with lifetimes; or (3) stack unwinding when a panic occurs, which causes issues similar 
to exception safety in C++ [Abrahams 1998]. We proved safety of the destructors of the verified 
libraries, but do not handle automatic destruction, which has already caused problems [Ben-Yehuda 
2015b] for which the Rust community still does not have a modular solution [Rust team 2016]. The 
remaining omissions are mostly unrelated to ownership, like proper support for type-polymorphic 
functions, and “unsized” types whose size is not statically known.¹²

Despite these limitations, we believe we have captured the essence of Rust’s ownership discipline. 
The framework provided by the lifetime logic proved flexible enough to handle functions that 
are correct for subtle reasons, like Ref::map and RefMut::map, part of RefCell, which had to
have their signature changed from the initial design to ensure soundness [Sapin 2015]. In fact, 
our verification work resulted in uncovering and fixing a bug in Rust’s standard library [Jung 
2017], demonstrating that our model of Rust is realistic enough to be useful. Furthermore, our 
type system already handles features that are still being sketched for Rust itself, like non-lexical 
lifetimes [Matsakis 2016b], and we are in active discussion with the Rust community on these 
topics. 

In ongoing and future work, we plan to fill some of the gaps mentioned above and to bring $\lambda_{\mathrm{Rust}}$
closer to MIR, the most important intermediate language in the Rust compiler. Concretely, we 
would like to make the fact that all local variables are heap-allocated more implicit, and to extend 
paths to include dereferencing a pointer. That should permit us to reduce the number of primitive 
instructions, making each of them correspond to exactly one construct in MIR.

¹² Notice that the Vec type, providing dynamically resizable arrays, is supported, though we have not implemented it. The
“unsized” types like [i32] (an array of integers) independent of any pointer indirection; those types are not currently
supported by our model.